Preface



Proof theory has long been established as a basic discipline of mathematical 
logic. It has recently become increasingly relevant to computer science. The
deductive apparatus provided by proof theory has proved useful for metatheoretical
purposes as well as for practical applications. Thus it seemed to us most natural 
to bring researchers together to assess both the role proof theory already plays 
in computer science and the role it might play in the future. 

The form of a Dagstuhl seminar is most suitable for purposes like this, as
Schloß Dagstuhl provides a very convenient and stimulating environment to
discuss new ideas and developments. To accompany the conference with a
proceedings volume appeared to us equally appropriate. Such a volume not only fixes
basic results of the subject and makes them available to a broader audience, but 
also signals to the scientific community that Proof Theory in Computer Science 
(PTCS) is a major research branch within the wider field of logic in computer 
science. 

Therefore everybody invited to the Dagstuhl seminar was also invited to 
submit a paper. However, preparation and acceptance of a paper for the volume 
was not a precondition of participating at the conference, since the idea of a 
Dagstuhl seminar as a forum for spontaneous and open discussions should be 
kept. Our idea was that the papers in this volume should be suitable as starting 
points for such discussions by presenting fundamental results which merit their 
publication in the Springer LNCS series. The quality and variety of the papers 
received and accepted rendered this plan fully justified. They are a state-of-the- 
art sample of proof-theoretic methods and techniques applied within computer 
science. 

In our opinion PTCS focuses on the impact proof theory has or should have 
on computer science, in particular with respect to programming. Major divisions 
of PTCS, as represented in this volume, are the following: 

1. The proofs as programs paradigm in general 

2. Typed and untyped systems related to functional programming 

3. Proof-theoretic approaches to logic programming 

4. Proof-theoretic ways of dealing with computational complexity 

5. Proof-theoretic semantics of languages for specification and programming 

6. Foundational issues 

This list is not intended to be exclusive. For example, there is undoubtedly 
some overlap between Automated Deduction and PTCS. However, since Auto- 
mated Deduction is already a well-established subdiscipline of logic in computer 
science with its own research programs, many of which are not related to proof 
theory, we did not include it as a core subject of PTCS. 

In the following, we briefly address the topics of PTCS mentioned and indi- 
cate how they are exemplified in the contributions to this volume. 

1. The most intrinsic relationship between proof theory and computer science,
if proof theory is understood as the theory of formal proofs and computer science 
as the theory of computing, is provided by the fact that in certain formalisms 
proofs can be evaluated (reduced) to normal forms. This means that proofs can 
be viewed as representing a (not necessarily deterministic) program for their own 
evaluation. In particular contexts they allow one to extract valuable information, 
which may be given, e.g., in the form of particular terms. The idea of considering 
proofs as programs, which in the context of the typed λ-calculus is known as
the Curry-Howard correspondence, is a research program touched upon by most
contributions to this volume. The papers by Baaz & Leitsch and by Berger 
are directly devoted to it. Baaz & Leitsch study the relative complexity of two 
cut elimination methods and show that they are intrinsically different. Berger 
investigates a proof of transfinite induction given by Gentzen in order to extract 
algorithms for function hierarchies from it. 

2. Functional programming has always been at the center of interest of proof 
theory, as it is based on the λ-calculus. Extensions of the typed λ-calculus, in
particular type theories, lead to powerful frameworks suitable for the
formalization of large parts of mathematics. The paper by Alt & Artemov develops a
reflective extension of the typed λ-calculus which internalizes its own
derivations as terms. Dybjer & Setzer show how indexed forms of inductive-recursive
definitions, which would enable a certain kind of generic programming, can be
added to Martin-Löf type theory. The main proof-theoretic paradigm competing
with type theory is based on type-free applicative theories and extensions thereof 
within Feferman’s general program of explicit mathematics. In his contribution, 
Studer uses this framework in an analysis of a fragment of Java. In particular, 
he manages to proceed without impredicative assumptions, thus supporting a 
general conjecture by Feferman. 

3. Logic programming, which uses the Horn clause fragment of first-order logic
as a programming language, is a natural topic of PTCS. Originally it was not 
developed within a proof-theoretic framework, and its theoretical background is 
often described in model-theoretic terms. However, it has turned out that a proof- 
theoretic treatment of logic programming is both nearer to the programmer’s way 
of thinking and conceptually and technically very natural. It also leads to strong 
extensions, including typed ones which combine features of functional and logic 
programming. In the present volume, Elbl uses proof-theoretic techniques to give 
“metalogical” operators in logic programming an appropriate rendering. 

4. The machine-independent characterization of classes of computational com- 
plexity not involving explicit bounds has recently gained much attention in proof 
theory. One such approach, relying on higher-type functionals, is used in Aehlig 
et al.’s paper to characterize the parallel complexity class NC. Another
proof-theoretic method based on term rewriting is applied by Oitavem in her
characterization of PSPACE and is compared and contrasted with other implicit
characterizations of this class. Gordeev asks the fundamental question of whether
functional analysis may serve as an alternative framework in certain subjects
of PTCS. He suggests that non-discrete methods may provide powerful tools
for dealing with certain computational problems, in particular those concerning
polynomial-time computability. 

5. Besides the systematic topics mentioned, the study of specific languages is 
an important aspect of PTCS. In his contribution, Schmitt develops and studies 
a language of iterate logic as the logical basis of certain specification and
modeling languages. Studer gives a denotational semantics of a fragment of Java.
By interpreting Featherweight Java in a proof-theoretically specified language
he shows that there is a direct proof-theoretic sense of denotational semantics
which differs both from model-theoretic and from domain-theoretic approaches.
This shows that the idea of proof-theoretic semantics, discussed in certain areas
of philosophical and mathematical logic, is becoming fruitful for PTCS as well.

6. Finally, two papers concern foundational aspects of languages. Baaz &
Fermüller show that in formalizing identity, it makes a significant difference
for uniform provability whether identity is formulated by means of axioms or by
means of a schema. The paper by Došen & Petrić presents a coherence result for
categories which they call “sesquicartesian”, contributing new insights into the
equality of arrows (and therefore into the equality of proofs and computations) 
and its decidability. 

We thank the authors and reviewers for their contributions and efforts. We 
are grateful to the Schloß Dagstuhl conference and research center for acting as
our host, and to Springer-Verlag for publishing these proceedings in their LNCS
series. 

The second and the third editor would like to add that it was Reinhard
Kahle’s idea to organize a Dagstuhl seminar on PTCS, and that he had the 
major share in preparing the conference and editing this volume. He would have 
been the first editor even if his name had not been the first alphabetically. 

Abstract. A typed lambda calculus with recursion in all finite types is
defined such that the first order terms exactly characterize the parallel 
complexity class NC. This is achieved by using appropriate forms
of recursion (concatenation recursion and logarithmic recursion), a
ramified type structure, and a linearity constraint.

Keywords: higher types, recursion, parallel computation, NC, lambda 
calculus, linear logic, implicit computational complexity 



1 Introduction 

One of the most prominent complexity classes, other than polynomial time, is 
the class NC of functions computable in parallel polylogarithmic time with a 
polynomial amount of hardware. This class has several natural characterizations 
in terms of circuits, alternating Turing machines, or parallel random access ma- 
chines as used in this work. It can be argued that NC is the class of efficiently 
parallelizable problems, just as polynomial time is generally considered as the
correct formalization of feasible sequential computation. 

Machine-independent characterizations of computational complexity classes 
are not only of theoretical, but recently also of increasing practical interest. 
Besides indicating the robustness and naturalness of the classes in question, 
they also provide guidance for the development of programming languages EH- 

* Supported by the DFG Graduiertenkolleg “Logik in der Informatik” 

** Supported by the DFG Emmy Noether-Programme under grant No. Jo 291/2-1 

* Supported by a Marie Curie fellowship of the European Union under grant no. ERB-
FMBI-CT98-3248

R. Kahle, P. Schroeder-Heister, and R. Stark (Eds.): PTCS 2001, LNCS 2183, pp. 1-^^ 2001. 

(c) Springer-Verlag Berlin Heidelberg 2001

The earliest such characterizations, starting with Cobham’s function algebra
for polynomial time 0, used recursions with explicit bounds on the growth of
the defined functions. Function algebra characterizations in this style of parallel
complexity classes, among them NC, were given by Clote 0 and Allen [1].

More elegant implicit characterizations, i.e., without any explicitly given 
bounds, but instead using logical concepts like ramification or tiering, have been 
given for many complexity classes, starting with the work of Bellantoni and Cook 
S3 and Leivant g3| on polynomial time. In his thesis gj, Bellantoni gives such 
a characterization of NC using a ramified variant of Clote’s recursion schemes. 
A different implicit characterization of NC, using tree recursion, was given by 
Leivant EE and refined by Bellantoni and Oitavem gj. Other parallel complex- 
ity classes, viz. parallel logarithmic and polylogarithmic time, were given implicit 
characterizations by Bellantoni j^j. Bloch Q and Leivant and Marion Ca- 

In order to apply the approach within the functional programming paradigm,
one has to consider functions of higher type, and thus extend the function
algebras by a typed lambda calculus. To really make use of this feature, it is desirable
to allow the definition of higher type functions by recursion. Higher type
recursion was originally considered by Gödel tXEj for the analysis of logical systems.
Systems with recursion in all finite types characterizing polynomial time were
given by Bellantoni et al. [5] and Hofmann PE based on the first-order system
of Bellantoni and Cook g| .

We define an analogous system that characterizes NC while allowing an
appropriate form of recursion, viz. logarithmic recursion as used by Clote [8] and
Bellantoni ^], in all finite types. More precisely, our system is a typed lambda
calculus which allows two kinds of function types, denoted $\sigma \multimap \tau$ and $\sigma \to \tau$,
and two sorts of variables of the ground type $\iota$, the complete ones in addition
to the usual ones, which are called incomplete for emphasis. A function of type
$\sigma \to \tau$ can only be applied to complete terms of type $\sigma$, i.e., terms containing
only complete free variables.

It features two recursion operators LR and CR, the latter corresponding to
Clote’s 5 concatenation recursion on notation, which can naturally only be
applied to first-order functions. The former is a form of recursion of logarithmic
length characteristic of all function algebra representations of NC, and here can
be applied to functions of all linear types, i.e., types only built up using $\iota$ and $\multimap$.
The function being iterated, as well as the numerical argument being recurred
on, have to be complete, i.e., the type of LR is $\sigma \multimap (\iota \to \sigma \multimap \sigma) \to \iota \to \sigma$ for
linear $\sigma$.

Our analysis clearly reveals the different roles played by the two forms of 
recursion in characterizing NC: Logarithmic recursion controls the runtime, in 
that the degree of the polylogarithm that bounds the runtime depends only on 
the number of occurrences of LR. On the other hand, concatenation recursion is 
responsible for parallelism; the degree of the polynomial bounding the amount 
of hardware used depends only on the number of occurrences of CR (and the
number of occurrences of the constant #).



Linear Ramified Higher Type Recursion and Parallel Complexity 






The crucial restriction in our system, justifying the use of linear logic nota- 
tion, is a linearity constraint on variables of higher types: all higher type variables 
in a term must occur at most once. 

The main new contribution in the analysis of the complexity of the system 
is a strict separation between the term, i.e., the program, and the numerical
context, i.e., its input and data. Whereas the runtime may depend polynomially 
on the former, it may only depend polylogarithmically on the latter. 

To make use of this conceptual separation, the algorithm that unfolds
recursions computes, given a term and context, a recursion-free term plus a new
context. In particular, it does not substitute numerical parameters, as this would 
immediately lead to linear growth, but only uses them for unfolding; in some 
cases, including the reduction of CR, it extends the context. This way, the growth 
of terms in the elimination of recursions is kept under control. In earlier systems 
that comprised at least polynomial time this strict distinction was not necessary, 
since the computation time there may depend on the input superlinearly. Note 
that any reasonable form of computation will depend at least linearly on the size 
of the program. 

A direct extension to higher types of the first-order system of Bellantoni 0 
would have a constant for concatenation recursion of linear type 
This causes problems in our analysis because the amount of hardware required 
depends exponentially on the number of CR in a term, thus we must not allow 
duplications of this constant during the unfolding of LR. The only way to avoid 
this is by giving CR the more restrictive type $(\iota \to \iota) \multimap \iota \to \iota$. This weaker form
of concatenation recursion nevertheless suffices to include all of NC, when the 
set of base functions is slightly extended. 

Finally, in order to be able to handle numerals in parallel logarithmic time, 
we use a tree data structure to store numerals during the computation. Whereas 
trees are used as the principal data structure in other characterizations of parallel 
complexity classes [llblltij, our system works with usual binary numerals, and
trees are only used in the implementation. 



2 Clote’s Function Algebra for NC 

Clote 0 gives a function algebra characterization of NC using two recursion
schemes. The class A is defined as the least class of functions that contains the
constant 0, projections $\pi^n_j(x_1, \ldots, x_n) = x_j$, the binary successors $s_0$, $s_1$, bit
test bit, binary length $|x| := \lceil \log_2(x + 1) \rceil$, and $\#$ where $x \# y = 2^{|x| \cdot |y|}$, and is
closed under composition and the following two forms of recursion:

A function $f$ is defined by concatenation recursion on notation (CRN) from
functions $g, h_0, h_1$ if

$$\begin{aligned}
f(0, \vec{x}) &= g(\vec{x}) \\
f(s_i(y), \vec{x}) &= s_{h_i(y, \vec{x})}(f(y, \vec{x})),
\end{aligned}$$

and $f$ is defined from $g, h_0, h_1$ and $r$ by weak bounded recursion on notation
(WBRN) if there is $F$ such that

$$\begin{aligned}
F(0, \vec{x}) &= g(\vec{x}) \\
F(s_i(y), \vec{x}) &= h_i(y, \vec{x}, F(y, \vec{x})) \\
F(y, \vec{x}) &\le r(y, \vec{x}) \\
f(y, \vec{x}) &= F(|y|, \vec{x}).
\end{aligned}$$
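
The base functions and the CRN scheme can be tried out on ordinary integers. The following is a minimal Python sketch, not part of Clote's presentation: the helper names (`length`, `smash`, `crn`, `ones`, `copy`) are illustrative, and the step functions `h0`, `h1` are assumed to return only 0 or 1.

```python
def length(x: int) -> int:
    """Binary length |x| = ceil(log2(x + 1)), so |0| = 0."""
    return x.bit_length()


def bit(x: int, i: int) -> int:
    """Bit test: the i-th binary digit of x."""
    return (x >> i) & 1


def smash(x: int, y: int) -> int:
    """x # y = 2^(|x| * |y|)."""
    return 1 << (length(x) * length(y))


def crn(g, h0, h1):
    """Return f defined by CRN from g, h0, h1 (h0, h1 assumed 0/1-valued)."""
    def f(y: int, *xs: int) -> int:
        if y == 0:
            return g(*xs)
        i, rest = y & 1, y >> 1                 # y = s_i(rest)
        h = h1 if i else h0
        return 2 * f(rest, *xs) + h(rest, *xs)  # s_b(z) = 2z + b
    return f


# Every step appends the bit 1: the result has the length of y, all ones.
ones = crn(lambda: 0, lambda y: 1, lambda y: 1)
# Step i appends the bit i: the result reproduces y bit by bit.
copy = crn(lambda: 0, lambda y: 0, lambda y: 1)
```

Each recursion step contributes exactly one output bit, which is why CRN produces values whose length grows only by the length of the recursion argument.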



Theorem 1 (Clote [8]). A number-theoretic function $f$ is in A if and only if
f is in NC. 

It is easy to see that the recursion scheme WBRN can be replaced by the
following scheme: $f$ is defined from $g$, $h$ and $r$ by bounded logarithmic recursion
if

$$\begin{aligned}
f(0, \vec{x}) &= g(\vec{x}) \\
f(y, \vec{x}) &= h(y, \vec{x}, f(H(y), \vec{x})) \quad \text{for } y > 0 \\
f(y, \vec{x}) &\le r(y, \vec{x}),
\end{aligned}$$

where $H(n) := \lfloor n / 2^{\lceil |n|/2 \rceil} \rfloor$ has about half the length of $n$. Both forms
of recursion produce $\log |y|$ iterations of the step function. We shall denote the
function algebra with bounded logarithmic recursion by A as well.
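
To see why this scheme iterates the step function only about $\log |y|$ times, the following Python sketch may help. It assumes the reading $H(n) = \lfloor n / 2^{\lceil |n|/2 \rceil} \rfloor$ of the garbled definition; the names `log_rec` and `steps` are illustrative, and the bound $r$ is left out because it is a side condition rather than part of the computation.

```python
def length(n: int) -> int:
    """Binary length |n|."""
    return n.bit_length()


def H(n: int) -> int:
    """Drop the lower ceil(|n|/2) bits, leaving about half the length."""
    return n >> ((length(n) + 1) // 2)


def log_rec(g, h):
    """Return f with f(0, xs) = g(xs) and f(y, xs) = h(y, xs, f(H(y), xs))."""
    def f(y: int, *xs: int):
        if y == 0:
            return g(*xs)
        return h(y, *xs, f(H(y), *xs))
    return f


# Count how often the step function is applied for a given y.
steps = log_rec(lambda: 0, lambda y, prev: prev + 1)
```

In this sketch the number of step applications equals the binary length of $|y|$, since each application of `H` halves the length (rounding down).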



3 Formal Definition of the System 

We use simple types with two forms of abstraction over a single base type $\iota$, i.e.,
our types are given by the grammar

$$\sigma, \tau ::= \iota \mid \sigma \multimap \tau \mid \sigma \to \tau,$$

and we call the types that are built up from $\iota$ and $\multimap$ only the linear types.

As the intended semantics for our base type are the binary numerals, we have
the constants 0 of type $\iota$ and $s_0$ and $s_1$ of type $\iota \multimap \iota$. Moreover we add constants
len of type $\iota \multimap \iota$ and bit of type $\iota \multimap \iota \multimap \iota$ for the corresponding base functions
of A.

The functionality of the base function $\#$ is split between two constants, a
unary $\#$ of type $\iota \to \iota$ to produce growth, and sm of type $\iota \multimap \iota \multimap \iota \multimap \iota$ that
performs the multiplication of lengths without producing growth. The intended
semantics, reflected in the conversion rules below, is $\# n = 2^{|n|^2}$ and
$\mathrm{sm}(w, a, b) = 2^{|a| \cdot |b|} \bmod 2^{|w|}$.

In order to embed A into the system, we need two more constants drop
of type $\iota \multimap \iota \multimap \iota$ and half of type $\iota \multimap \iota$, intended to denote the functions
$\mathrm{drop}(n, m) = \lfloor n / 2^{|m|} \rfloor$ and $H$, respectively.
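
The split of $\#$ into a growing unary constant and a non-growing `sm` can be checked numerically. The sketch below assumes the readings $\# n = 2^{|n|^2}$, $\mathrm{sm}(w,a,b) = 2^{|a| \cdot |b|} \bmod 2^{|w|}$ and $\mathrm{drop}(n,m) = \lfloor n/2^{|m|} \rfloor$ of the garbled formulas; the Python names are illustrative.

```python
def length(n: int) -> int:
    """Binary length |n|."""
    return n.bit_length()


def unary_smash(n: int) -> int:
    """# n = 2^(|n|^2): the only base constant that produces growth."""
    return 1 << (length(n) ** 2)


def sm(w: int, a: int, b: int) -> int:
    """sm(w, a, b) = 2^(|a| * |b|) mod 2^|w|: never longer than w."""
    return (1 << (length(a) * length(b))) % (1 << length(w))


def drop(n: int, m: int) -> int:
    """drop(n, m) = floor(n / 2^|m|): remove the |m| lowest bits of n."""
    return n >> length(m)


def half(n: int) -> int:
    """H(n): remove the lower ceil(|n|/2) bits of n."""
    return n >> ((length(n) + 1) // 2)
```

With a sufficiently long first argument, `sm(w, a, b)` agrees with $a \# b$; otherwise the result is truncated to 0, so `sm` by itself cannot create numerals longer than `w`.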

We allow case-distinction for arbitrary types, so we have a constant $d_\sigma$ of
type $\iota \multimap \sigma \multimap \sigma \multimap \sigma$ for every type $\sigma$. Recursion is added to the system via
the constant LR and parallelism via the constant CR. Their types are

$$\begin{aligned}
\mathrm{CR} &: (\iota \to \iota) \multimap \iota \to \iota \\
\mathrm{LR}_\sigma &: \sigma \multimap (\iota \to \sigma \multimap \sigma) \to \iota \to \sigma \quad \text{for } \sigma \text{ linear}
\end{aligned}$$

Terms are built from variables and constants via abstraction and typed
application. We have incomplete variables of every type, denoted by x, y, ... and
complete variables of ground type, denoted by x, y, .... All our variables and
terms have a fixed type and we add type superscripts to emphasize the type: $x^\sigma$,
$x^\iota$, $t^\sigma$.

Corresponding to the two kinds of function types, there are two forms of 
abstraction 



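$$(\lambda x^\sigma . t^\tau)^{\sigma \multimap \tau} \quad \text{and} \quad (\lambda x^\iota . t^\tau)^{\iota \to \tau}$$

and two forms of application

$$(t^{\sigma \multimap \tau} s^\sigma)^\tau \quad \text{and} \quad (t^{\sigma \to \tau} s^\sigma)^\tau,$$

where in the last case we require s to be complete, and a term is called complete
if all its free variables are. It should be noted that, although we cannot form
terms of type $\sigma \to \tau$ with $\sigma \ne \iota$ directly via abstraction, it is still important to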
have that type in order to express, for example, that the first argument of LR 
must not contain free incomplete variables. 

In the following we omit the type subscripts at the constants $d_\sigma$ and $\mathrm{LR}_\sigma$ if
the type is obvious or irrelevant. Moreover we identify $\alpha$-equal terms. As usual
application associates to the left. A binary numeral is either 0, or of the form
$s_{i_1}(\ldots(s_{i_k}(s_1 0)))$. We abbreviate the binary numeral $(s_1 0)$ by 1.

The semantics of $\iota$ as binary numerals (rather than binary words) is given
by the conversion rule $s_0 0 \mapsto 0$. In the following definitions we identify binary
numerals with the natural number they represent. The base functions get their
usual semantics, i.e., we add conversion rules $\mathrm{len}\, n \mapsto |n|$, $\mathrm{drop}\, n\, m \mapsto \mathrm{drop}(n, m)$,
$\mathrm{half}\, n \mapsto H(n)$, $\mathrm{bit}\, n\, i \mapsto \lfloor n / 2^i \rfloor \bmod 2$, $\mathrm{sm}\, w\, m\, n \mapsto \mathrm{sm}(w, m, n)$. Moreover, we
add the conversion rules



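$$\begin{aligned}
d_\sigma\, 0 &\mapsto \lambda x^\sigma y^\sigma . x \\
d_\sigma\, (s_i n) &\mapsto \lambda x_0^\sigma x_1^\sigma . x_i \\
\#\, n &\mapsto s_0^{|n|^2} 1 \\
\mathrm{CR}\, h\, 0 &\mapsto 0 \\
\mathrm{CR}\, h\, (s_i n) &\mapsto d_{\iota \multimap \iota}\, (h\, (s_i n))\, s_0\, s_1\, (\mathrm{CR}\, h\, n) \\
\mathrm{LR}\, g\, h\, 0 &\mapsto g \\
\mathrm{LR}\, g\, h\, n &\mapsto h\, n\, (\mathrm{LR}\, g\, h\, (\mathrm{half}\, n))
\end{aligned}$$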




K. Aehlig et al. 



Here we always assumed that $n$, $m$ and $s_i n$ are binary numerals, and in particular
that the latter does not reduce to 0. In the last rule, $n$ has to be a binary numeral
different from 0.

As usual the reduction relation is the closure of $\mapsto$ under all term forming
operations and equivalence is the symmetric, reflexive, transitive closure of the
reduction relation. As all reduction rules are correct with respect to the intended
semantics and obviously all closed normal terms of type $\iota$ are numerals, closed
terms $t$ of type $\iota$ have a unique normal form that we denote by $t^{\mathrm{nf}}$.

As usual, lists of notations for terms/numbers/... that only differ in
successive indices are denoted by leaving out the indices and putting an arrow
over the notation. It is usually obvious where to add the missing indices. If not
we add a dot wherever an index is left out. Lists are inserted into formulae
“in the natural way”, e.g., $\overrightarrow{h m_\cdot} = h m_1, \ldots, h m_k$ and $x \vec{t} = ((x t_1) \ldots t_k)$ and
$|g| + |\vec{s}| = |g| + |s_1| + \cdots + |s_k|$. Moreover, by abuse of notation, we denote lists
consisting of both complete and incomplete variables also by $\vec{x}$.

As already mentioned, we are not interested in all terms of the system, but 
only in those fulfilling a certain linearity condition. 

Definition 1. A term t is called linear, if every variable of higher type in t 
occurs at most once. 

Since we allow that the variable $x$ does not occur in $\lambda x.t$, our linear terms should
correctly be called affine, but we keep the more familiar term linear. 
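
Definition 1 amounts to a simple occurrence count. The following Python sketch checks it on a toy term representation; the classes `Var`, `Const`, `App`, `Lam` and the flag `higher` (marking variables whose type is not the ground type) are hypothetical and not taken from the paper, and bound variables are assumed to have distinct names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str
    higher: bool = False  # True if the variable's type is not the ground type


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    var: Var
    body: object


def occurrences(t, acc=None) -> Counter:
    """Count variable occurrences in t (binders themselves are not counted)."""
    acc = Counter() if acc is None else acc
    if isinstance(t, Var):
        acc[t] += 1
    elif isinstance(t, App):
        occurrences(t.fun, acc)
        occurrences(t.arg, acc)
    elif isinstance(t, Lam):
        occurrences(t.body, acc)
    return acc


def is_linear(t) -> bool:
    """Every higher-type variable occurs at most once (i.e. t is affine)."""
    return all(n <= 1 for v, n in occurrences(t).items() if v.higher)
```

Ground-type variables may be duplicated freely, and a bound higher-type variable that is never used is still accepted, which is the affine reading mentioned above.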



4 Completeness 

Definition 2. A term $t : \vec{\iota} \to \iota$ denotes the function $f(\vec{x})$ if for every $\vec{n}$,
$t \vec{n}$ reduces to the numeral $f(\vec{n})$.

We will sometimes identify a term with the function it denotes. 

In order to prove that our term system can denote all functions in NC, we
first have to define some auxiliary terms. We define $\mathrm{ones} := \mathrm{CR}(\lambda x.1)$, then we
have that $\mathrm{ones}\, n = 2^{|n|} - 1$, i.e., a numeral of the same length as $n$ consisting of
ones only. We use this to define

$$\le_\ell \; := \lambda y\, b .\, \mathrm{bit}\, (\mathrm{ones}\, y)\, (\mathrm{len}\, b),$$

so that $\le_\ell m\, n$ is the characteristic function of $|m| \le |n|$. We will write $\le_\ell$ infix
in the following. It is used to define

$$\max\nolimits_\ell := \lambda a\, b .\, d\, (a \le_\ell b)\, b\, a$$

computing the longer of two binary numerals.

Next we define $\mathrm{rev} := \lambda x . \mathrm{CR}(\lambda i . \mathrm{bit}\, x\, i)$, so that $\mathrm{rev}\, m\, n$ returns the $|n|$ least
significant bits of $m$ reversed. Finally we define the binary predecessor as
$p := \lambda x . \mathrm{drop}\, x\, 1$.
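
The auxiliary terms can be evaluated directly by transcribing the conversion rules for `d` and `CR` to integers. This is a sketch under the assumption that the rules read $d\,0 \mapsto \lambda x y. x$, $d\,(s_i n) \mapsto \lambda x_0 x_1 . x_i$ and $\mathrm{CR}\, h\, (s_i n) \mapsto d\,(h\,(s_i n))\, s_0\, s_1\, (\mathrm{CR}\, h\, n)$; the names `cmp_len` and `max_len` are illustrative. Evaluated literally, the length comparison term yields 0 exactly when its first argument is not longer than its second, and 0 is the value on which `d` selects its first branch.

```python
def length(n: int) -> int:
    return n.bit_length()


def bit(n: int, i: int) -> int:
    return (n >> i) & 1


def drop(n: int, m: int) -> int:
    return n >> length(m)


def d(b: int, x, y):
    """Case distinction on the last bit of b: first branch for 0 or even b."""
    return x if b % 2 == 0 else y


def CR(h, n: int) -> int:
    """Concatenation recursion: one output bit per bit of n."""
    if n == 0:
        return 0
    prefix = CR(h, n >> 1)
    return d(h(n), 2 * prefix, 2 * prefix + 1)


def ones(n: int) -> int:
    """ones := CR (lambda x. 1)."""
    return CR(lambda x: 1, n)


def cmp_len(y: int, b: int) -> int:
    """bit (ones y) (len b): 0 if |y| <= |b|, and 1 otherwise."""
    return bit(ones(y), length(b))


def max_len(a: int, b: int) -> int:
    """d (cmp_len a b) b a: the longer of the two numerals."""
    return d(cmp_len(a, b), b, a)


def p(x: int) -> int:
    """Binary predecessor: drop the last bit."""
    return drop(x, 1)
```

When both arguments have the same length, `max_len` returns its second argument.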

Theorem 2. For every function $f(\vec{x})$ in A, there is a closed linear term $t_f$ of
type $\vec{\iota} \to \iota$ that denotes $f$.

Proof. The proof follows the lines of Bellantoni’s [2] completeness proof for his
two-sorted function algebra for NC.

We will use the following fact: for every function $f \in A$ there is a polynomial
$q_f$ such that for all $\vec{n}$, $|f(\vec{n})| \le q_f(|\vec{n}|)$. To prove the theorem, we will prove
the following stronger claim:

For every $f(\vec{x}) \in A$, there is a closed linear term $t_f$ of type $\iota \to \vec{\iota} \multimap \iota$
and a polynomial $p_f$ such that for every $\vec{n}$, $t_f w \vec{n}$ reduces to $f(\vec{n})$ for
all $w$ with $|w| \ge p_f(|\vec{n}|)$.



The claim implies the theorem, since by use of the constant $\#$ and the term
$\max_\ell$, we can define terms $w_f : \vec{\iota} \to \iota$ such that for all $\vec{n}$, $|w_f \vec{n}| \ge p_f(|\vec{n}|)$.

We prove the claim by induction on the definition of $f$ in the function algebra
A with bounded logarithmic recursion.

If $f$ is any of the base functions $0, s_i, |\cdot|, \mathrm{bit}$, then we let $t_f := \lambda w . c$ where $c$ is
the corresponding constant of our system, and for $f = \pi^n_j$ we let $t_f := \lambda w \vec{x} . x_j$.
In these cases we can set $p_f = 0$, and the claim obviously holds.

If $f$ is $\#$, then we set $t_f := \lambda w . \mathrm{sm}\, w$. It holds that $t_f w a b = a \# b$ as long
as $|a| \cdot |b| < |w|$, so we set $p_f(x, y) = x \cdot y + 1$.

If $f$ is defined by composition, $f(\vec{x}) = h(\overrightarrow{g(\vec{x})})$, then by induction we have
terms $t_h, \vec{t_g}$ and polynomials $p_h, \vec{p_g}$. We define $t_f := \lambda w \vec{x} .\, t_h w (\overrightarrow{t_g w \vec{x}})$ and
$p_f(\vec{x}) := p_h(\overrightarrow{q_g(\vec{x})}) + \overrightarrow{p_g(\vec{x})}$. The claim follows easily from the induction
hypothesis.



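Now let $f$ be defined by CRN from $g, h_0, h_1$, and let $t_g, t_{h_i}$ be given by
induction. First we define a function $h$ that combines the two step functions into
one, by

$$h := \lambda w\, y .\, d\, y\, (t_{h_0} w\, (p\, y))\, (t_{h_1} w\, (p\, y))$$

then we use this to define a function $f'$ that computes an end-segment of $f(y, \vec{x})$
reversed, using CR, by

$$\begin{aligned}
\mathrm{aux} &:= \lambda w\, y\, \vec{x}\, z .\, d\, (z \le_\ell y)\, (h\, w\, (\mathrm{drop}\, y\, (p\, z))\, \vec{x})\, (\mathrm{bit}\, (t_g w \vec{x})\, (|z| - |y| - 1)) \\
f' &:= \lambda w\, y\, \vec{x} .\, \mathrm{CR}\, (\mathrm{aux}\, w\, y\, \vec{x}),
\end{aligned}$$

where $|z| - |y| - 1$ is computed as $\mathrm{len}\, (\mathrm{drop}\, z\, (s_1 y))$. Finally, the computed value
is reversed, and $t_f$ is defined by

$$t_f := \lambda w\, y\, \vec{x} .\, \mathrm{rev}\, (f'\, w\, y\, \vec{x}\, w)\, w.$$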

In order for this to work, $w$ has to be large enough for $g$ and the $h_i$ to be
computed correctly by the inductive hypothesis, thus $p_f$ needs to maximize $p_g$
and the $p_{h_i}$. Also, $w$ has to be long enough for the concatenation recursion in the
definition of $f'$ to actually compute all bits of $f(y, \vec{x})$, so $|w|$ has to be larger
than $|y| + |g(\vec{x})|$. All this is guaranteed if we set

$$p_f(y, \vec{x}) := p_g(\vec{x}) + \sum_{i=0,1} p_{h_i}(y, \vec{x}) + y + q_g(\vec{x}) + 1.$$



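Finally, let $f$ be defined by bounded logarithmic recursion from $g$, $h$ and $r$,

$$\begin{aligned}
f(0, \vec{x}) &= g(\vec{x}) \\
f(y, \vec{x}) &= h(y, \vec{x}, f(H(y), \vec{x})) \quad \text{for } y > 0 \\
f(y, \vec{x}) &\le r(y, \vec{x}),
\end{aligned}$$

and let $t_g$ and $t_h$ be given by induction. In order to define $t_f$, we cannot use
logarithmic recursion on $y$ since $y$ is incomplete. Instead we simulate the recursion
on $y$ by a recursion on a complete argument.

We first define a function $Y$ that yields the values $H^{(k)}(y)$ that are needed
in the recursion as

$$\begin{aligned}
S &:= \lambda u\, v^{\iota \multimap \iota}\, y .\, d\, (u \le_\ell z)\, y\, (\mathrm{half}\, (v\, y)) \\
Y &:= \lambda z\, w .\, \mathrm{LR}\, (\lambda y . y)\, S\, w.
\end{aligned}$$

We now use this function to define a term $f'$ computing $f$ by recursion on a
complete argument $z$ by

$$\begin{aligned}
T &:= \lambda u\, v^{\iota \multimap \vec{\iota} \multimap \iota}\, y\, \vec{x} .\, d\, ((Y u\, w\, y) = 0)\, (t_g w \vec{x})\, (t_h w\, (Y u\, w\, y)\, \vec{x}\, (v\, y\, \vec{x})) \\
f' &:= \lambda w\, z .\, \mathrm{LR}\, (\lambda y\, \vec{x} . 0)\, T\, z
\end{aligned}$$

where the test $x = 0$ is implemented as $d\, (\mathrm{bit}\, (s_0 x)\, (\mathrm{len}\, x))\, 1\, 0$. Finally, $t_f$ is
defined by identifying the complete arguments in $f'$:

$$t_f := \lambda w .\, f'\, w\, w$$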
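
The role of $Y$ can be checked numerically: when $z$ runs through $H^{(k)}(w)$, the value $Y(z, w, y)$ should run through $H^{(k)}(y)$. The Python sketch below assumes the readings $Y := \lambda z\, w .\, \mathrm{LR}\, (\lambda y . y)\, S\, w$ and $S := \lambda u\, v\, y .\, d\, (u \le_\ell z)\, y\, (\mathrm{half}\, (v\, y))$ of the garbled definitions, with the length comparison evaluated literally as $\mathrm{bit}\, (\mathrm{ones}\, u)\, (\mathrm{len}\, z)$; `iterate_half` is a helper introduced only for the check.

```python
def length(n: int) -> int:
    return n.bit_length()


def bit(n: int, i: int) -> int:
    return (n >> i) & 1


def half(n: int) -> int:
    """H(n): remove the lower ceil(|n|/2) bits."""
    return n >> ((length(n) + 1) // 2)


def d(b: int, x, y):
    """Case distinction: first branch if b is 0 or even, else second."""
    return x if b % 2 == 0 else y


def cmp_len(u: int, z: int) -> int:
    """bit (ones u) (len z), using ones u = 2^|u| - 1."""
    return bit((1 << length(u)) - 1, length(z))


def LR(g, h, n: int):
    """LR g h 0 = g;  LR g h n = h n (LR g h (half n)) for n > 0."""
    if n == 0:
        return g
    return h(n)(LR(g, h, half(n)))


def Y(z: int, w: int, y: int) -> int:
    """Recursion on the complete argument w; z selects the stage."""
    S = lambda u: lambda v: lambda yy: d(cmp_len(u, z), yy, half(v(yy)))
    return LR(lambda yy: yy, S, w)(y)


def iterate_half(n: int, k: int) -> int:
    """H applied k times to n."""
    for _ in range(k):
        n = half(n)
    return n
```

For example, with `w = 65535` and `y = 201`, `Y(w, w, y)` returns `y` itself and `Y(half(w), w, y)` returns `half(y)`, matching the identities used in the correctness argument.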

To show the correctness of this definition, define

$$p_f(y, \vec{x}) := 2y + p_h(y, \vec{x}, q_r(y, \vec{x})) + p_g(\vec{x})$$

and fix $y$, $\vec{x}$ and $w$ with $|w| \ge p_f(|y|, |\vec{x}|)$.

Note that the only values of $z$ for which the function $Y$ is ever invoked during
the computation are $H^{(k)}(w)$ for $0 \le k \le ||y||$, and that for these values of $z$,
$Y(z, w, y)$ varies over the values $H^{(k)}(y)$. By a downward induction on $k$ we show
that for these values of $z$,

$$f'(w, z, y, \vec{x}) = f(Y(z, w, y), \vec{x}).$$

This implies the claim for $t_f$, since $Y(w, w, y) = y$.

The induction basis occurs for $k = ||y||$, where $Y(z, w, y) = 0$. Since $|w| \ge
2|y|$, we have $z > 0$, thus the recursive step in the definition of $f'$ is used, and
the first branch of the case distinction is chosen. Therefore the equality follows
from the fact that $w$ is large enough for $t_g$ to compute $g$ correctly.

In the inductive step, we use the fact that $Y(H(z), w, y) = H(Y(z, w, y))$,
and that $w$ is large enough for $t_h$ to compute $h$ correctly. Since for $z = H^{(k-1)}(w)$
we have $Y(z, w, y) > 0$, we get

$$\begin{aligned}
f'(w, z, y, \vec{x}) &= t_h(w, Y(z, w, y), \vec{x}, f'(w, H(z), y, \vec{x})) \\
&= t_h(w, Y(z, w, y), \vec{x}, f(Y(H(z), w, y), \vec{x})) \\
&= t_h(w, Y(z, w, y), \vec{x}, f(H(Y(z, w, y)), \vec{x})) \\
&= h(Y(z, w, y), \vec{x}, f(H(Y(z, w, y)), \vec{x})) \\
&= f(Y(z, w, y), \vec{x})
\end{aligned}$$

where the second equality holds by the induction hypothesis. This completes the
proof of the claim and the theorem. □

5 Soundness 

Definition 3. The length $|t|$ of a term $t$ is inductively defined as follows: For a
variable $x$, $|x| = 1$, and for any constant $c$ other than $d$, $|c| = 1$, whereas $|d| = 3$.
For complex terms we have the usual clauses $|r s| = |r| + |s|$ and $|\lambda x . r| = |r| + 1$.

The length of the constant $d$ is motivated by the desire to decrease the length
of a term in the reduction of a $d$-redex.

Note that due to our identification of natural numbers with binary numerals,
the notation $|n|$ is ambiguous now. Nevertheless, in the following we will only
use $|n|$ as the term length defined above which for numerals $n$ differs from the
binary length only by one.

Definition 4. For a list $\vec{n}$ of numerals, define $|\vec{n}| := \max(\overrightarrow{|n|})$.

Definition 5. A context is a list of pairs $(x, n)$ of variables (complete or
incomplete) of type $\iota$ and numerals, where all the variables are distinct. If $\vec{x}$ is a list
of distinct variables of type $\iota$ and $\vec{n}$ a list of numerals of the same length, then
we denote by $\vec{x}; \vec{n}$ the context $\overrightarrow{(x, n)}$.

Definition 6. For every symbol $c$ of our language and term $t$, $\#_c(t)$ denotes the
number of occurrences of $c$ in $t$. For obvious aesthetic reasons we abbreviate $\#_\#(t)$
by $\#(t)$.

Definition 7. A term $t$ is called simple if $t$ contains none of the constants $\#$,
CR or LR.
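
Definitions 3, 6 and 7 are straightforward structural recursions. A minimal Python sketch on a hypothetical term representation (the classes `Var`, `Const`, `App`, `Lam` are illustrative, not from the paper) shows in particular why $|d| = 3$: the rule $d\, 0 \mapsto \lambda x y . x$ then shortens the term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


def term_length(t) -> int:
    """Definition 3: |x| = 1, |c| = 1 except |d| = 3, |r s| = |r| + |s|, |lam x. r| = |r| + 1."""
    if isinstance(t, Var):
        return 1
    if isinstance(t, Const):
        return 3 if t.name == "d" else 1
    if isinstance(t, App):
        return term_length(t.fun) + term_length(t.arg)
    return term_length(t.body) + 1


def count_const(c: str, t) -> int:
    """Definition 6: number of occurrences of the constant c in t."""
    if isinstance(t, Const):
        return 1 if t.name == c else 0
    if isinstance(t, App):
        return count_const(c, t.fun) + count_const(c, t.arg)
    if isinstance(t, Lam):
        return count_const(c, t.body)
    return 0


def is_simple(t) -> bool:
    """Definition 7: t contains none of the constants #, CR, LR."""
    return all(count_const(c, t) == 0 for c in ("#", "CR", "LR"))
```

With $|d| = 1$ the redex $d\, 0$ would have length 2 while its reduct $\lambda x y . x$ has length 3; with $|d| = 3$ the redex has length 4, so the reduction step decreases the length.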

Bounding the Size of Numerals

Lemma 1. Let $t$ be a simple, linear term of type $\iota$ and $\vec{x}; \vec{n}$ a context, such
that all free variables in $t$ are among $\vec{x}$. Then for $t^* := t[\vec{x} := \vec{n}]^{\mathrm{nf}}$ we have
$|t^*| \le |t| + |\vec{n}|$.

Proof. By induction on $|t|$. We distinguish cases according to the form of $t$.

Case 1: $t$ is $x \vec{r}$ for a variable $x$. Since $x$ must be of type $\iota$, $\vec{r}$ must be
empty, and $t^*$ is just one of the numerals in $\vec{n}$.

Case 2: $t$ is $c \vec{r}$ for a constant $c$. Here we have four subcases, depending on
the constant $c$.

Case 2a: $c$ is 0, so $\vec{r}$ is empty and $t$ is already normal.

Case 2b: $c$ is $s_i$, so $t$ is $c\, r$ for a term $r$ of type $\iota$. Let $r^* := r[\vec{x} := \vec{n}]^{\mathrm{nf}}$,
by the induction hypothesis we have $|r^*| \le |r| + |\vec{n}|$, and therefore we get
$|t^*| \le |r^*| + 1 \le |t| + |\vec{n}|$.

Case 2c: $c$ is one of the constants len, half, drop, bit or sm, so $t$ is $c\, r\, \vec{s}$ for
terms $r, \vec{s}$ of type $\iota$. Let $r^* := r[\vec{x} := \vec{n}]^{\mathrm{nf}}$, by the induction hypothesis we
have $|r^*| \le |r| + |\vec{n}|$, and therefore we get $|t^*| \le |r^*| \le |t| + |\vec{n}|$.

Case 2d: $c$ is $d_\sigma$, so $t$ is $d_\sigma\, s\, u_0\, u_1\, \vec{v}$, where $s$ is of type $\iota$, and $u_i$ are of type $\sigma$.
Depending on the last bit $i$ of the value of $s[\vec{x} := \vec{n}]$, $t$ reduces to the shorter
term $t' = u_i \vec{v}$, to which we can apply the induction hypothesis obtaining the
normal form $t^*$ with $|t^*| \le |t'| + |\vec{n}| \le |t| + |\vec{n}|$.

Case 3: $t$ is $(\lambda x . r)\, s\, \vec{v}$. Here we have two subcases, depending on the number
of occurrences of $x$ in $r$.

Case 3a: $x$ occurs at most once, then the term $t' := r[x := s]\, \vec{v}$ is smaller
than $t$, and we can apply the induction hypothesis to $t'$.

Case 3b: $x$ occurs more than once, and thus is of type $\iota$. Then $s$ is of type $\iota$, so
we first apply the induction hypothesis to $s$, obtaining $s^* := s[\vec{x} := \vec{n}]^{\mathrm{nf}}$ with
$|s^*| \le |s| + |\vec{n}|$. Now we let $t' := r \vec{v}$, and we apply the induction hypothesis
to $t'$ and the context $\vec{x}, x; \vec{n}, s^*$, so we get

$$|t^*| \le |t'| + |\vec{n}, s^*| \le |t'| + |s| + |\vec{n}|.$$

The last case, where $t$ is $\lambda x . r$, cannot occur because of the type of $t$. □



Data Structure 

We represent terms as parse trees, fulfilling the obvious typing constraints. The 
number of edges leaving a particular node is called the out-degree of this node. 
There is a distinguished node with in-degree 0, called the root. Each node is 
stored in a record consisting of an entry cont indicating its kind, plus some 
pointers to its children. We allow the following kinds of nodes with the given 
restrictions: 

— Variable nodes representing a variable x. Variable nodes have out-degree 0.
Every variable has a unique name and an associated register R[x].

— Abstraction nodes λx representing the binding of the variable x. Abstraction
nodes have out-degree one, and we denote the pointer to its child by succ.

— For each constant c, there are nodes representing the constant c. These nodes 
have out-degree 0. 

— Application nodes @ representing the application of two terms. The obvious 
typing constraints have to be fulfilled. We denote the pointers to the two 
children of an application node by left and right. 

— Auxiliary nodes $\kappa_i$ representing the composition of type one. These nodes
are labeled with a natural number i, and each of those nodes has out-degree
either 2 or 3. They will be used to form 2/3-trees (as e.g. described by
Knuth) representing numerals during the computation. We require that
any node reachable from a $\kappa_i$ node is either a $\kappa_i$ node as well or one
of the constants $s_0$ or $s_1$.

— Auxiliary nodes $\kappa'$ representing the identification of type-one terms with
numerals (via “applying” them to 0). The out-degree of such a node, which is
also called a “numeral node”, either is zero, in which case the node represents
the term 0, or the out-degree is one and the edge starting from this node
either points to one of the constants $s_0$ or $s_1$ or to a $\kappa_i$ node.

— Finally, there are so-called dummy nodes o of out-degree 1. The pointer to 
the child of a dummy node is again denoted by succ. Dummy nodes serve to 
pass on pointers: a node that becomes superfluous during reduction is made 
into a dummy node, and any pointer to it will be regarded as if it pointed 
to its child. 
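
The role of dummy nodes as pointer forwarders can be illustrated with a minimal Python sketch. The record layout (`Node`), the string `"dummy"` standing for the dummy kind, and the helper names `resolve` and `make_dummy` are assumptions made for this illustration; the text itself only fixes the fields cont, succ, left and right.

```python
from dataclasses import dataclass
from typing import Optional

DUMMY = "dummy"  # hypothetical tag for the dummy kind


@dataclass
class Node:
    cont: str                       # kind of the node, e.g. "@", "var", DUMMY
    succ: Optional["Node"] = None   # single child (abstraction and dummy nodes)
    left: Optional["Node"] = None   # children of an application node
    right: Optional["Node"] = None


def resolve(node: Node) -> Node:
    """Follow succ pointers until a non-dummy node is reached."""
    while node.cont == DUMMY:
        node = node.succ
    return node


def make_dummy(node: Node, target: Node) -> None:
    """Turn a superfluous node into a dummy that forwards to target."""
    node.cont = DUMMY
    node.succ = target
    node.left = node.right = None


zero = Node("0")
app = Node("@", left=Node("var"), right=Node("var"))
make_dummy(app, zero)   # any pointer to app now behaves like a pointer to zero
```

In this sketch a pointer is never rewritten when its target becomes superfluous; readers of the pointer call `resolve` instead, which mirrors the statement that a pointer to a dummy is regarded as pointing to its child.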

A tree is called a numeral if the root is a numeral node, all leaves have the
same distance to the root and the label i of every $\kappa_i$ node is the number of
leaves reachable from that node. By standard operations on 2/3-trees it is
possible in sequential logarithmic time to

— split a numeral at a given position i.

— find out the i'th bit of the numeral.

— concatenate two numerals.

So using $\kappa'$ and $\kappa_i$ nodes is just a way of implementing “nodes” labeled
with a numeral allowing all the standard operations on numerals in logarithmic time.
Note that the length of the label i (coded in binary) of a $\kappa_i$ node is bounded
by the logarithm of the number of nodes.
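
To see why leaf-count labels give logarithmic-time access to a bit, here is a small Python sketch. It is a simplification under stated assumptions: the tree is built by binary halving rather than by genuine 2/3-tree insertion, so leaf depths may differ by one, and the names `Inner`, `build` and `bit_at` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Inner:
    leaves: int               # label i: number of leaves below this node
    children: List["Tree"]


Tree = Union[int, Inner]      # a leaf is a bit, 0 or 1


def size(t: Tree) -> int:
    return t.leaves if isinstance(t, Inner) else 1


def build(bits: List[int]) -> Tree:
    """Build a balanced labeled tree over a non-empty list of bits."""
    if len(bits) == 1:
        return bits[0]
    mid = len(bits) // 2
    return Inner(len(bits), [build(bits[:mid]), build(bits[mid:])])


def bit_at(t: Tree, i: int) -> int:
    """Return leaf number i (0-based, left to right) by following the labels."""
    while isinstance(t, Inner):
        for child in t.children:
            if i < size(child):
                t = child
                break
            i -= size(child)      # skip this subtree entirely
        else:
            raise IndexError("position outside the numeral")
    if i != 0:
        raise IndexError("position outside the numeral")
    return t
```

Each step of `bit_at` descends one level and inspects at most a constant number of children, so the cost is proportional to the depth of the tree. Splitting and concatenation on real 2/3-trees follow the same pattern but need rebalancing, which this sketch omits.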



Normalization Algorithms and Their Complexity 

Lemma 2. Let $t$ be a simple, linear term of type $\iota$ and $\vec{x}; \vec{n}$ a
context such that all free variables in $t$ are among the $\vec{x}$. Then the normal
form of $t[\vec{x} := \vec{n}]$ can be computed in time $O(|t| \cdot \log|\vec{n}|)$
by $O(|t| \cdot |\vec{n}|)$ processors.

Proof. We start one processor for each of the nodes of the parse-tree of $t$, with a
pointer to this node in its local register. The registers associated to the variables
$\vec{x}$ in the context contain pointers to the respective numerals $\vec{n}$, and the
registers associated to all other variables are initialized with a NULL pointer.

The program operates in rounds, where the next round starts once all active
processors have completed the current round. The only processors that will ever
do something are those at the application or variable nodes. Thus all processors
where cont ∉ {@, x, d} can halt immediately. Processors at d nodes do not halt
because they will be converted to variable nodes in the course of the reduction.

The action of a processor at an application node in one round depends on
the type of its sons. If the right son is a dummy node, i.e., right.cont = o,
then this dummy is eliminated by setting right := right.succ. Otherwise, the
action depends on the type of the left son.

— If left.cont = o, then eliminate this dummy by setting left := left.succ.

— If left.cont = λx, then this β-redex is partially reduced by copying the
argument right into the register R[x] associated to the variable x. The
substitution part of the β-reduction is then performed by the processors at
variable nodes. Afterwards, replace the @ and λx nodes by dummies by setting
cont := o, left.cont := o and succ := left.

— If left.cont ∈ {$s_i$, len, half} and the right son is a numeral,
right.cont = $\kappa'$, then replace the current node by a dummy, and let succ point
to a numeral representing the result. In the case of $s_i$ and half, this can be
implemented by 2/3-tree operations using sequential time $O(\log|\vec{n}|)$.

In the case of len, the result is equal to the number i of leaves of the numeral
argument. This value is read off the topmost $\kappa_i$ node, and a numeral of that
value is produced. Since i is a number of length $O(\log|\vec{n}|)$, this can also be
done in sequential time $O(\log|\vec{n}|)$.

— If left.cont = @, left.left.cont ∈ {drop, bit} and right and left.right
both point to numerals, then again replace the current node by a dummy,
and let succ point to a numeral representing the result, which again can be
computed by 2/3-tree operations in time $O(\log|\vec{n}|)$.

— If left.cont = left.left.cont = @, left.left.left.cont = sm and all of
right, left.right and left.left.right point to numerals, then again the
current node is replaced by a dummy with succ pointing to the result.

To compute the result, the lengths i and j are read off the second and third
argument, and multiplied. As i and j are $O(\log|\vec{n}|)$ bit numbers, this can
be done in parallel time $O(\log\log|\vec{n}|)$ by $O(\log^3|\vec{n}|)$ many processors.
The product $i \cdot j$ is compared to the length of the first argument; let the
maximum of both be k. Now the result is a numeral consisting of a one
followed by k zeroes, which can be produced in parallel time $\log^2 k$ by $O(k)$
many processors using the square-and-multiply method, which suffices since
$k \le O(\log|\vec{n}|)$.

— Finally, if left.cont = d and right.cont = $\kappa'$, then extract the last bit
b of the numeral at right, and create two new variables $x_0$ and $x_1$. Then
reduce the d-redex by replacing the current node and the right son by
abstraction nodes, and the left son by a variable node, i.e., setting
cont := λ$x_0$, right.cont := λ$x_1$, succ := right, succ.succ := left and
left.cont := $x_b$.

A processor with cont = x only becomes active when R[x] ≠ NULL, and what
it does then depends on the type of x.

If x is not of ground type, then the variable x occurs only in this place, so
the substitution can be safely performed by setting cont := o and succ := R[x].

If x is of type $\iota$, the processor waits until the content of register R[x] has
been normalized, i.e., it acts only if R[x].cont = $\kappa'$. In this case, it replaces
the variable node by a dummy, and lets succ point to a newly formed copy of the
numeral in R[x]. This copy can be produced in parallel time $O(\log|\vec{n}|)$ by
$|\vec{n}|$ processors, since the depth of any numeral is bounded by $\log|\vec{n}|$.

Concerning correctness, note that the tree structure is preserved, since
numerals being substituted for type $\iota$ variables are explicitly copied, and
variables of higher type occur at most once. Obviously, no redex is left when the
program halts.

For the time bound, observe that every processor performs at most one proper
reduction plus possibly some dummy reductions. Every dummy reduction makes
one dummy node unreachable, so the number of dummy reductions is bounded
by the number of dummy nodes generated. Every dummy used to be a proper
node, and the number of nodes is at most $2|t|$, so this number is bounded by
$2|t|$. Thus at most $4|t|$ reductions are performed, and the program ends after at
most that many rounds. As argued above, every round takes at most $O(\log|\vec{n}|)$
operations with $O(|\vec{n}|)$ many additional processors. □

The next lemma is the key to show that all terms can be normalized in NC: it 
shows how to eliminate the constants #, LR and CR. As mentioned in the intro- 
duction, we have to distinguish between the program, i.e., the term we wish to 
normalize, and its input, given as a context. The runtime and length of the out- 
put term may depend polynomially on the former, but only polylogarithmically 
on the latter. 

Since an ordinary $O(\cdot)$-analysis is too coarse for the inductive argument, we
need a more refined asymptotic analysis. Therefore we introduce the following
notation:

\[ f(n) \lesssim g(n) \;:\Longleftrightarrow\; f(n) \le (1 + o(1))\, g(n) , \]

or equivalently $\limsup_{n\to\infty} \frac{f(n)}{g(n)} \le 1$.
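
As a small worked instance of this refined comparison (our own illustration, not taken from the text, writing $\lesssim$ for the relation just defined): lower-order additive terms are absorbed, but constant factors are not, which is exactly what distinguishes the relation from an $O(\cdot)$ bound.

```latex
\[
  n + \log n \;\lesssim\; n
  \quad\text{since}\quad
  n + \log n = \Bigl(1 + \tfrac{\log n}{n}\Bigr)\, n = (1 + o(1))\, n ,
\]
\[
  \text{whereas}\quad 2n \not\lesssim n
  \quad\text{because}\quad
  \limsup_{n\to\infty} \frac{2n}{n} = 2 > 1 ,
  \quad\text{although}\quad 2n = O(n) .
\]
```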

Lemma 3. Let $t$ be a linear term of linear type and $\vec{x}; \vec{n}$ a context with
all free variables of $t[\vec{x} := \vec{n}]$ incomplete. Then there are a term
$\mathrm{simp}(t, \vec{x}; \vec{n})$ and a context $\vec{y}; \vec{m}$ such that
$\mathrm{simp}(t, \vec{x}; \vec{n})[\vec{y} := \vec{m}]$ is simple and equivalent
to $t[\vec{x} := \vec{n}]$, and which can be computed in time

\[ T(|\vec{n}|) \lesssim 2^{\#_{LR}(t)} \cdot |t| \cdot \bigl(2^{\#(t)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(t)+2} \]

by

\[ P(|\vec{n}|) \lesssim |t| \cdot |\vec{n}|^{2^{\#(t)}(\#_{CR}(t)+2)} \cdot (\log|\vec{n}|)^{2^{\#(t)}(\#_{LR}(t)+1)} \]

processors, such that

\[ |\mathrm{simp}(t, \vec{x}; \vec{n})| \lesssim |t| \cdot \bigl(2^{\#(t)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(t)}
   \quad\text{and}\quad |\vec{m}| \lesssim |\vec{n}|^{2^{\#(t)}} . \]

The proof is somewhat lengthy, so we sketch it first:

We start by describing the algorithm. It searches for the head-redex and
reduces it in the obvious way (and then continues in the same way until the
term is normal): in the case of a ground-type β-redex enlarge the context, in the
case of a higher type β-redex reduce it in the term; in the case of LR the step
term has to be unfolded only logarithmically many times, so we can just form
a new term, whereas in the case of CR we have to use parallelism. However, in
this case the result of every processor is just a single bit, so the results can be
collected efficiently and returned to the context (whereas in the case of LR the
result is a term of higher type). Note the crucial interplay between the length of
the term, the size of the context, the running time and the number of processors
needed; therefore we have to provide all four bounds simultaneously.

After the description of the algorithm, a long and tedious (but elementary)
calculation follows showing that all bounds indeed hold in every case. The
structure of the proof is always the same: in the interesting cases a numerical
argument has to be evaluated in order to be able to reduce the redex (i.e., the
numeral we recurse on, or the numeral to be put in the context in the case of a
ground type β-redex). Then the induction hypothesis yields the size of this numeral
and also the amount of time and processors needed. Then calculate the length of the
unfolded term. The induction hypothesis for this term yields the amount of time
and processors needed for the final computation, and also the bounds for the
final output. Summing up all the times (calculation of the numeral, unfolding,
final computation) one verifies that the time bound holds as well.

Proof (of Lemma 3). By induction on $\#_{LR}(t)$, with a side-induction on $|t|$, show
that the following algorithm does it:

By pattern matching, determine in time $O(|t|)$ the form of $t$, and branch
according to the form.

— If $t$ is a variable or one of the constants 0 or d, then return $t$ and leave
$\vec{x}; \vec{n}$ unchanged.

— If $t$ is $c\,\vec{s}$ where $c$ is one of the constants $s_i$, drop, bit, len or sm
then recursively simplify $\vec{s}$, giving $\vec{s}'$ and contexts
$\vec{y}_j; \vec{m}_j$, and return $c\,\vec{s}'$ and $\vec{y}; \vec{m}$.

— If $t$ is $d\,r\,\vec{s}$, then simplify $r$ giving $r'$ and $\vec{y}; \vec{m}$.
Compute the numeral $r^* := r'[\vec{y} := \vec{m}]^{nf}$, and reduce the redex
$d\,r^*$, giving $t'$, and recursively simplify $t'\,\vec{s}$ with context
$\vec{x}; \vec{n}$.

— If $t$ is $\#\,r$ then simplify $r$ giving $r'$ and $\vec{y}; \vec{m}$. Compute the
numeral $r^* := r'[\vec{y} := \vec{m}]^{nf}$, and return a new variable $y'$ and the
context $y';\ 2^{|r^*|^2}$.

— If $t$ is $\mathrm{CR}\,h\,r$, then simplify $r$ giving $r'$ and $\vec{y}; \vec{m}$
and compute the numeral $r^* := r'[\vec{y} := \vec{m}]^{nf}$.

Spawn $|r^*|$ many processors, one for each leaf of $r^*$, by moving along the
tree structure of $r^*$. The processor at bit $i$ of $r^*$ simplifies $h\,z$ in the
context $\vec{x}, z;\ \vec{n}, \lfloor r^*/2^i \rfloor$ (with $z$ a new complete
variable), giving a term $h_i$ and context $\vec{y}_i; \vec{m}_i$; then it computes
$h_i^* := h_i[\vec{y}_i := \vec{m}_i]^{nf}$, retaining only the lowest order bit $b_i$.

The bits $b_i$ are collected into a 2/3-tree representation of a numeral $m$,
which is output in the form of a new variable $z$ and the context $z; m$.

— If $t$ is $\mathrm{LR}\,g\,h\,m\,\vec{s}$ then simplify $m$, giving $m'$ and
$\vec{x}_m; \vec{n}_m$. Normalize $m'$ in the context
$\vec{x}, \vec{x}_m;\ \vec{n}, \vec{n}_m$, giving $m^*$. Form $k$ numerals
$m_i = \mathrm{half}^i(m^*)$ and sequentially simplify $h\,m_i$, giving $h'_i$. (Of
course, more precisely simplify $h\,x$ for a new variable $x$ in the context
extended by $x; m_i$.) Then form the term

\[ t' := h'_0\,(h'_1 \dots (h'_k\, g))\ \vec{s} \]

and simplify it.

— If $t$ is of the form $\lambda x.r$ then recursively simplify $r$.

— If $t$ is of the form $(\lambda x.r)\, s\, \vec{s}$ and $x$ occurs at most once in
$r$ then recursively simplify $r[x := s]\, \vec{s}$.

— If $t$ is of the form $(\lambda x.r)\, s\, \vec{s}$ and $x$ occurs several times in
$r$, then simplify $s$ giving $s'$ and a context $\vec{x}'; \vec{n}'$. Normalize $s'$
in this context giving the numeral $s^*$. Then simplify $r\,\vec{s}$ in the context
$\vec{x}, x;\ \vec{n}, s^*$.
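
The treatment of CR and LR in the algorithm above can be mirrored on plain integers. The following Python sketch is only an illustration under stated assumptions: numerals are modelled as non-negative Python integers, the step terms are ordinary Python functions, the recursion scheme assumed for LR is "apply the step at m, half(m), half(half(m)), ... down to the base value", and the names `cr_bits` and `lr` are hypothetical. No actual parallelism is used; the point is that each CR bit is computed independently, while LR needs only as many unfoldings as the argument has bits.

```python
from typing import Callable, List


def cr_bits(h: Callable[[int], int], r: int) -> List[int]:
    """Bits of the CR result, least significant first.

    Position i depends only on floor(r / 2**i), so every position could be
    handled by its own processor; here the positions are visited in a loop.
    """
    return [h(r >> i) & 1 for i in range(r.bit_length())]


def lr(g, h: Callable[[int, object], object], m: int):
    """Unfold the step function once per bit of m (assumed recursion scheme)."""
    args = []
    while m > 0:              # m, half(m), half(half(m)), ...
        args.append(m)
        m >>= 1
    result = g
    for a in reversed(args):  # the innermost application gets the smallest value
        result = h(a, result)
    return result


# The step x -> x + 1 flips every bit of the argument.
print(cr_bits(lambda x: x + 1, 0b1011))
# Building a string shows the shape of the unfolded term.
print(lr("g", lambda a, acc: f"h{a}({acc})", 5))
```

The second call makes the logarithmic unfolding visible: for the argument 5 the step function is applied at 5, 2 and 1 only.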

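For correctness, note that in the case $d\,r\,\vec{s}$ simplifying $r$ takes time

\[ \lesssim 2^{\#_{LR}(r)} \cdot |r| \cdot \bigl(2^{\#(r)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(r)+2} \]

and uses

\[ \lesssim |r| \cdot |\vec{n}|^{2^{\#(r)}(\#_{CR}(r)+2)} (\log|\vec{n}|)^{2^{\#(r)}(\#_{LR}(r)+1)} \]

many processors. For the output we have
$|r'| \lesssim |r| \cdot (2^{\#(r)} \cdot \log|\vec{n}|)^{\#_{LR}(r)}$ and
$|\vec{m}| \lesssim |\vec{n}|^{2^{\#(r)}}$. Hence the time used to normalize $r'$ (using
the algorithm of Lemma 2) is $O(|r'| \cdot \log|\vec{m}|)$, which is (order of)

\[ |r| \cdot \bigl(2^{\#(r)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(r)+1} \]

and the number of processors needed is
$O(|r'| \cdot |\vec{m}|) \lesssim |r| \cdot |\vec{n}|^{2^{\#(r)}+1}$. Finally,
to simplify $t'\,\vec{s}$ we need time

\[ \lesssim 2^{\#_{LR}(\vec{s})} \cdot (|\vec{s}| + 3) \cdot \bigl(2^{\#(\vec{s})} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(\vec{s})+2} \]

and the number of processors is

\[ \lesssim (|\vec{s}| + 3) \cdot |\vec{n}|^{2^{\#(\vec{s})}(\#_{CR}(\vec{s})+2)} (\log|\vec{n}|)^{2^{\#(\vec{s})}(\#_{LR}(\vec{s})+1)} . \]

Summing up gives an overall time that is

\[ \lesssim 2^{\#_{LR}(t)} \cdot (|r| + |\vec{s}| + 3) \cdot \bigl(2^{\#(t)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(t)+2} \]

which is a correct bound since $|d| = 3$. Maximizing gives that the overall number
of processors is

\[ \lesssim |t| \cdot |\vec{n}|^{2^{\#(t)}(\#_{CR}(t)+2)} (\log|\vec{n}|)^{2^{\#(t)}(\#_{LR}(t)+1)} . \]

The length of the output term is

\[ \lesssim (|\vec{s}| + 3) \cdot \bigl(2^{\#(\vec{s})} \log|\vec{n}|\bigr)^{\#_{LR}(\vec{s})} \]

and the size of the output context is $\lesssim |\vec{n}|^{2^{\#(\vec{s})}}$, which suffices.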

In the case $\#\,r$ we obtain the same bounds for simplification and
normalization of $r$ as in the previous case. For $r^*$ we get

\[ |r^*| = O(|r'| + |\vec{m}|) \lesssim |\vec{n}|^{2^{\#(r)}} . \]

Computing the output now takes time

\[ \log|r^*|^2 = 2^{\#(r)+1} \cdot \log|\vec{n}| \]

and

\[ \lesssim |\vec{n}|^{2^{\#(r)+1}} \]

many processors. Thus the overall time is

\[ \lesssim 2^{\#_{LR}(r)} \cdot |r| \cdot \bigl(2^{\#(r)+1} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(r)+2} \]

and the number of processors is

\[ \lesssim |r| \cdot |\vec{n}|^{2^{\#(r)+1}(\#_{CR}(r)+2)} \cdot (\log|\vec{n}|)^{2^{\#(r)}(\#_{LR}(r)+1)} . \]

The length of the output term is 1, and the size of the output context is bounded
by $|r^*|^2 + 1 \lesssim |\vec{n}|^{2^{\#(r)+1}}$, which implies the claim.



In the case $\mathrm{CR}\,h\,r$ note that the arguments $h: \iota \to \iota$ and
$r: \iota$ both have to be present, since $t$ has to be of linear type (and
$\mathrm{CR}: (\iota \to \iota) \to \iota \to \iota$). We obtain the same bounds for
simplification and normalization of $r$ and the length of the numeral $r^*$ as in the
previous case. Spawning the parallel processors and collecting the result in the end
each needs time $\log|r^*| = 2^{\#(r)} \cdot \log|\vec{n}|$. The main work is done by
the $|\vec{n}|^{2^{\#(r)}}$ many processors that do the simplification and
normalization of the step terms. Each of them takes time

\[ \lesssim 2^{\#_{LR}(h)} \cdot (|h| + 1) \cdot \bigl(2^{\#(h)} \cdot \log|\vec{n}, r^*|\bigr)^{\#_{LR}(h)+2} \]
\[ \lesssim 2^{\#_{LR}(h)} \cdot (|h| + 1) \cdot \bigl(2^{\#(h)+\#(r)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(h)+2} \]

and a number of sub-processors satisfying

\[ \lesssim (|h| + 1) \cdot |\vec{n}, r^*|^{2^{\#(h)}(\#_{CR}(h)+2)} \cdot (\log|\vec{n}, r^*|)^{2^{\#(h)}(\#_{LR}(h)+1)} \]
\[ \lesssim (|h| + 1) \cdot |\vec{n}|^{2^{\#(h)+\#(r)}(\#_{CR}(h)+2)} \cdot \bigl(2^{\#(r)} \log|\vec{n}|\bigr)^{2^{\#(h)}(\#_{LR}(h)+1)} \]
\[ \lesssim (|h| + 1) \cdot |\vec{n}|^{2^{\#(h)+\#(r)}(\#_{CR}(h)+2)} \cdot (\log|\vec{n}|)^{2^{\#(h)+\#(r)}(\#_{LR}(h)+1)} \]

to compute $h_i$ and $\vec{y}_i; \vec{m}_i$ with

\[ |h_i| \lesssim (|h| + 1) \cdot \bigl(2^{\#(h)} \cdot \log|\vec{n}, r^*|\bigr)^{\#_{LR}(h)}
         \lesssim (|h| + 1) \cdot \bigl(2^{\#(h)+\#(r)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(h)} \]

and

\[ |\vec{m}_i| \lesssim |\vec{n}, r^*|^{2^{\#(h)}} \lesssim |\vec{n}|^{2^{\#(h)+\#(r)}} . \]

Now the normal form $h_i^*$ is computed in time

\[ O(|h_i| \cdot \log|\vec{m}_i|) \lesssim (|h| + 1) \cdot \bigl(2^{\#(h)+\#(r)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(h)+1} \]

by

\[ O(|h_i| \cdot |\vec{m}_i|) \lesssim (|h| + 1) \cdot |\vec{n}|^{2^{\#(h)+\#(r)}+1} \cdot (\log|\vec{n}|)^{2^{\#(h)+\#(r)}(\#_{LR}(h)+1)} \]

many sub-processors. Summing up the times yields that the overall time is

\[ \lesssim 2^{\#_{LR}(r)} \cdot |r| \cdot \bigl(2^{\#(r)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(r)+2}
   + 2^{\#_{LR}(h)} \cdot (|h| + 1) \cdot \bigl(2^{\#(h)+\#(r)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(h)+2} \]
\[ \lesssim 2^{\#_{LR}(t)} \cdot (|r| + |h| + 1) \cdot \bigl(2^{\#(t)} \cdot \log|\vec{n}|\bigr)^{\#_{LR}(t)+2} . \]

The number of sub-processors used by each of the $|r^*|$ processes is

\[ \lesssim (|h| + 1) \cdot |\vec{n}|^{2^{\#(h)+\#(r)}(\#_{CR}(h)+2)} \cdot (\log|\vec{n}|)^{2^{\#(h)+\#(r)}(\#_{LR}(h)+1)} \]

and multiplying this by the upper bound $|\vec{n}|^{2^{\#(r)}}$ on the number of
processes yields that the bound for the total number of processors holds. The output
term is of length 1, and the length of the output context is bounded by
$|r^*| \lesssim |\vec{n}|^{2^{\#(r)}}$.



In the case $\mathrm{LR}\,g\,h\,m\,\vec{s}$ note that, as $t$ has linear type, all the
arguments up to and including the $m$ have to be present. Moreover, $h$ is in a
complete position, so it cannot contain incomplete free variables, therefore neither
can any of the $h'_i$; so $t'$ really is linear. Due to the typing restrictions of the
LR the step functions $h\,m_i$ have linear type. So in all cases we are entitled to
recursively call the algorithm and to apply the induction hypothesis. For calculating
$m^*$ we have the same bounds as in the previous cases. We have
$k \sim \log|m^*| \lesssim \log|\vec{n}|$.

The time needed for calculating the $h'_i$ is

\[ \lesssim k \cdot 2^{\#_{LR}(h)} (|h| + 1) \bigl(2^{\#(h)} \log|\vec{n}, m^*|\bigr)^{\#_{LR}(h)+2} \]
\[ \lesssim 2^{\#_{LR}(h)} (|h| + 1) \bigl(2^{\#(h)+\#(m)} \log|\vec{n}|\bigr)^{\#_{LR}(h)+3} . \]

For the length $|h'_i|$ we have

\[ |h'_i| \lesssim (|h| + 1) \bigl(2^{\#(h)} \cdot \log|\vec{n}, m^*|\bigr)^{\#_{LR}(h)}
          \lesssim (|h| + 1) \bigl(2^{\#(h)+\#(m)} \log|\vec{n}|\bigr)^{\#_{LR}(h)} \]

and the length of the numerals $\vec{n}'$ in the contexts output by the computation
of the $h'_i$ is bounded by
$|\vec{n}, m^*|^{2^{\#(h)}} \lesssim (|\vec{n}|^{2^{\#(m)}})^{2^{\#(h)}} = |\vec{n}|^{2^{\#(m)+\#(h)}}$.
For the length of $t'$ we have

\[ |t'| \lesssim k\,|h'_i| + |g| + |\vec{s}|
        \lesssim (|h| + |g| + |\vec{s}| + 1) \bigl(2^{\#(h)+\#(m)} \log|\vec{n}|\bigr)^{\#_{LR}(h)+1} . \]

So the final computation takes time

\[ \lesssim 2^{\#_{LR}(t')}\, |t'|\, \bigl(2^{\#(t')} \log|\vec{n}, \vec{n}'|\bigr)^{\#_{LR}(t')+2} \]
\[ \lesssim 2^{\#_{LR}(g)+\#_{LR}(\vec{s})} (|h| + |g| + |\vec{s}| + 1) \bigl(2^{\#(h)+\#(m)} \log|\vec{n}|\bigr)^{\#_{LR}(h)+1}
   \cdot \bigl(2^{\#(g)+\#(\vec{s})} \cdot \log|\vec{n}, \vec{n}'|\bigr)^{\#_{LR}(g)+\#_{LR}(\vec{s})+2} \]
\[ \lesssim 2^{\#_{LR}(g)+\#_{LR}(\vec{s})} (|h| + |g| + |\vec{s}| + 1) \bigl(2^{\#(t)} \log|\vec{n}|\bigr)^{\#_{LR}(h)+1+\#_{LR}(g)+\#_{LR}(\vec{s})+2} . \]

So summing up all the times one verifies that the time bound holds. The number
of processors needed in the final computation is

\[ \lesssim |t'| \cdot |\vec{n}, \vec{n}'|^{2^{\#(t')}(\#_{CR}(t')+2)} \cdot (\log|\vec{n}, \vec{n}'|)^{2^{\#(t')}(\#_{LR}(t')+1)} \]
\[ \lesssim (|h| + |g| + |\vec{s}| + 1) \bigl(2^{\#(m)+\#(h)} \log|\vec{n}|\bigr)^{\#_{LR}(h)+1}
   \cdot \bigl(|\vec{n}|^{2^{\#(m)+\#(h)}}\bigr)^{2^{\#(t')}(\#_{CR}(t')+2)}
   \cdot \bigl(2^{\#(m)+\#(h)} \log|\vec{n}|\bigr)^{2^{\#(t')}(\#_{LR}(t')+1)} \]
\[ \lesssim (|h| + |g| + |\vec{s}| + 1) \cdot |\vec{n}|^{2^{\#(m)+\#(h)+\#(t')}(\#_{CR}(t')+2)}
   \cdot \bigl(2^{\#(m)+\#(h)} \log|\vec{n}|\bigr)^{\#_{LR}(h)+1+2^{\#(t')}(\#_{LR}(t')+1)} \]
\[ \lesssim |t| \cdot |\vec{n}|^{2^{\#(t)}(\#_{CR}(t)+2)} \cdot \bigl(2^{\#(m)+\#(h)} \log|\vec{n}|\bigr)^{2^{\#(t')}(\#_{LR}(h)+\#_{LR}(t')+2)} \]
\[ \lesssim |t| \cdot |\vec{n}|^{2^{\#(t)}(\#_{CR}(t)+2)} \cdot (\log|\vec{n}|)^{2^{\#(m)+\#(h)+\#(t')}(\#_{LR}(h)+\#_{LR}(t')+2)} . \]

The context finally output is bounded by

\[ \lesssim |\vec{n}, \vec{n}'|^{2^{\#(t')}} \lesssim |\vec{n}|^{2^{\#(m)+\#(h)+\#(t')}} . \]

The length of the final output is bounded by

\[ \lesssim |t'| \cdot \bigl(2^{\#(t')} \log|\vec{n}, \vec{n}'|\bigr)^{\#_{LR}(t')} \]
\[ \lesssim (|h| + |g| + |\vec{s}| + 1) \bigl(2^{\#(h)+\#(m)} \log|\vec{n}|\bigr)^{\#_{LR}(h)+1}
   \cdot \bigl(2^{\#(t)} \log|\vec{n}|\bigr)^{\#_{LR}(t')} \]
\[ \lesssim (|h| + |g| + |\vec{s}| + 1) \bigl(2^{\#(t)} \log|\vec{n}|\bigr)^{\#_{LR}(t')+\#_{LR}(h)+1} . \]

So all bounds hold in this case.

In the case $\lambda x.r$ note that due to the fact that $t$ has linear type $x$ has
to be incomplete, so we are entitled to use the induction hypothesis.

In the case $(\lambda x.r)\, s\, \vec{s}$ with several occurrences of $x$ in $r$ note
that due to the fact that $t$ is linear, $x$ has to be of ground type (since higher
type variables are only allowed to occur once). The time needed to calculate $s'$ is
bounded by

\[ 2^{\#_{LR}(s)}\, |s|\, \bigl(2^{\#(s)} \log|\vec{n}|\bigr)^{\#_{LR}(s)+2} \]

and the number of processors is not too high. For the length of $s'$ we have

\[ |s'| \lesssim |s| \cdot \bigl(2^{\#(s)} \log|\vec{n}|\bigr)^{\#_{LR}(s)} \]

and $|\vec{n}'| \lesssim |\vec{n}|^{2^{\#(s)}}$. So the time for calculating $s^*$ is
bounded by

\[ \lesssim |s'| \log|\vec{n}, \vec{n}'| \lesssim |s| \cdot \bigl(2^{\#(s)} \log|\vec{n}|\bigr)^{\#_{LR}(s)+1} . \]

For the length of the numeral $s^*$ we have

\[ |s^*| \le |s'| + |\vec{n}, \vec{n}'| \lesssim |\vec{n}|^{2^{\#(s)}} . \]

So the last computation takes time

\[ 2^{\#_{LR}(r\vec{s})} \cdot |r\,\vec{s}| \cdot \bigl(2^{\#(r\vec{s})+\#(s)} \log|\vec{n}|\bigr)^{\#_{LR}(r\vec{s})+2} . \]

Summing up, the time bound holds. The number of processors needed for the
last computation is bounded by

\[ \lesssim |r\,\vec{s}| \cdot \bigl(|\vec{n}|^{2^{\#(s)}}\bigr)^{2^{\#(r\vec{s})}(\#_{CR}(r\vec{s})+2)}
   \cdot \bigl(2^{\#(s)} \log|\vec{n}|\bigr)^{2^{\#(r\vec{s})}(\#_{LR}(r\vec{s})+1)} \]
\[ \lesssim |t| \cdot |\vec{n}|^{2^{\#(s)+\#(r\vec{s})}(\#_{CR}(r\vec{s})+2)}
   \cdot (\log|\vec{n}|)^{2^{\#(s)+\#(r\vec{s})}(\#_{LR}(r\vec{s})+1)} . \]

The context finally output is bounded by
$|\vec{n}, s^*|^{2^{\#(r\vec{s})}} \lesssim |\vec{n}|^{2^{\#(s)+\#(r\vec{s})}}$.

In all other cases the bounds trivially hold. □

We conclude that a term of linear type can be simplified by an NC algorithm,
where the degree of the runtime bound only depends on the number of
occurrences of LR, and the degree of the hardware bound only depends on the number
of occurrences of # and CR. More precisely, we have the following corollary.

Corollary 1. The term $\mathrm{simp}(t, \vec{x}; \vec{n})$ and the new context
$\vec{y}; \vec{m}$ in the above lemma can be computed in time

\[ T(|\vec{n}|) \le O\bigl((\log|\vec{n}|)^{\#_{LR}(t)+2}\bigr) \]

by a number of processors satisfying

\[ P(|\vec{n}|) \le O\bigl(|\vec{n}|^{2^{\#(t)}(\#_{CR}(t)+3)}\bigr) . \]

Theorem 3. Let $t$ be a linear term of type $\vec{\iota} \to \iota$. Then the function
denoted by $t$ is in NC.

Proof. Let $\vec{n}$ be an input, given as 2/3-tree representations of numerals, and
$\vec{x}$ complete variables of type $\iota$. Using Lemma 3 we compute
$t' := \mathrm{simp}(t\,\vec{x}, \vec{x}; \vec{n})$ and a new context $\vec{y}; \vec{m}$
with $|t'| \le (\log|\vec{n}|)^{O(1)}$ and $|\vec{m}| \le |\vec{n}|^{O(1)}$ in time
$(\log|\vec{n}|)^{O(1)}$ by $|\vec{n}|^{O(1)}$ many processors.

Then using Lemma 2 we compute the normal form $t'[\vec{y} := \vec{m}]^{nf}$ in time
$O(|t'| \cdot \log|\vec{m}|) = (\log|\vec{n}|)^{O(1)}$ by
$O(|t'| \cdot |\vec{m}|) = |\vec{n}|^{O(1)}$ many processors.

Hence the function denoted by $t$ is computable in polylogarithmic time by
polynomially many processors, and thus is in NC. □

From Theorems Inland 0 we immediately get our main result: 

Corollary 2. A number-theoretic function f is in NC if and only if it is denoted
by a linear term of our system.



Abstract. We introduce a general purpose typed λ-calculus $\lambda^\infty$ which
contains intuitionistic logic, is capable of internalizing its own derivations
as λ-terms and yet enjoys strong normalization with respect to a natural
reduction system. In particular, $\lambda^\infty$ subsumes the typed λ-calculus. The
Curry-Howard isomorphism converting intuitionistic proofs into λ-terms
is a simple instance of the internalization property of $\lambda^\infty$. The standard
semantics of $\lambda^\infty$ is given by a proof system with proof checking
capacities. The system $\lambda^\infty$ is a theoretical prototype of reflective
extensions of a broad class of type-based systems in programming languages, provers,
AI and knowledge representation, etc.



1 Introduction 

According to the Curry-Howard isomorphism, the calculus of intuitionistic
propositions (types) and the calculus of typed λ-terms (proof terms) constitute a
pair of isomorphic though distinct structures. Combining those logical and
computational universes has been considered a major direction in theoretical logic
and applications (propositions-as-types, proofs-as-programs paradigms, etc., cf.
Hanna). Modern computational systems are often capable of performing
logical derivations, formalizing their own derivations and computations, proof and
type checking, normalizing λ-terms, etc. A basic theoretical prototype of such
a system would be a long anticipated joint calculus of propositions (types) and
typed λ-terms (proofs, programs). There are several natural requirements for
such a calculus arising from the intended reading of the type assertion t:F as a
proposition t is a proof of F.

1. A type assertion t:F should be treated as a legitimate proposition (hence
a type). The intended informal semantics of t:F could be t is a proof of F or
t has type F. In particular, t:F could participate freely in constructing new
types (propositions). Such a capacity would produce types containing λ-terms
(proofs, programs) inside. For example, the following types should be allowed:
t:F → s:G, t:F ∧ s:G, s:(t:F), etc. In programming terms this means
a possibility of encoding computations inside types. A system should contain
both the intuitionistic logic (as a calculus of types) and the simply typed
λ-calculus.

* The research described in this paper was supported in part by ARO under the MURI
program “Integrated Approach to Intelligent Systems”, grant DAAH04-96-1-0341,
by DARPA under program LPE, project 34145.

2. Such a system should contain the reflection principle t:F → F
representing a fundamental property of proofs: if t is a proof of F then F holds.
An alternative type-style reading of this principle says: if t is of type F then F
is inhabited.

3. A system should contain the explicit proof checking (the type checking)
operation “!” and the principle t:F → !t:(t:F). The informal reading of “!” is
that given a term t the term !t describes a computation verifying t:F (i.e. that
t is a proof of F, or that t has type F). The proof checking and the reflection
principles enable us to naturally connect relevant types up and down from a
given one. Such a possibility has been partly realized by means of modal logic in
modal principles □F → F and □F → □□F respectively, though this presentation
lacks the explicit character of typed terms (cf. j3Ej). 

4. A system should accommodate the Curry-Howard isomorphism that maps
natural derivations in intuitionistic logic to well defined typed λ-terms. A
fundamental closure requirement suggests generalizing the Curry-Howard isomorphism
to the Internalization Property of the system: if $A_1, A_2, \dots, A_n \vdash B$ then
for fresh variables $x_1, x_2, \dots, x_n$ and some term $t(x_1, x_2, \dots, x_n)$

\[ x_1 : A_1,\ x_2 : A_2,\ \dots,\ x_n : A_n \vdash t(x_1, x_2, \dots, x_n) : B . \]

5. Desirable properties of terms in such a system include strong normalizability
with respect to β-reduction, projection reductions, etc., as well as other natural
properties of typed λ-terms.

The main goal of this paper is to find such a system. The reflective λ-calculus
$\lambda^\infty$ introduced below is the minimal system that meets these specifications.

The Logic of Proofs LP ( I1I2I3| ) where proofs are represented by advanced 
combinatory terms (called proof polynomials), satisfies properties 1-4 above.
Property 5 is missing in LP since the combinatory format does not admit
reductions and thus does not leave room for nontrivial normalizations.

Reflexive λ-terms correspond to proof polynomials over intuitionistic logic in
the basis {→, ∧}. Those polynomials are built from constants and variables by
two basic operations: “·” (application), “!” (proof checker). The logical identities
for those proof polynomials are described by the corresponding fragment of the
Logic of Proofs LP. In LP the usual set of type formation rules is extended by
a new one: given a proof polynomial p and formula F build a new type
(propositional formula) p:F. Valid identities in this language are given by
propositional formulas in the extended language generated by the calculus having axioms

A0. Axiom schemes and rules of intuitionistic logic

A1. t:F → F                           “verification”

A2. t:(F → G) → (s:F → (t·s):G)       “application”

A3. t:F → !t:(t:F)                    “proof checker”

Rule axiom necessitation:

given a constant c and an axiom F from A0-A3 infer c:F.

The standard semantics of proof polynomials is operations on proofs in a
proof system containing intuitionistic logic. The operation “application” has the
usual meaning as applying a proof t of F → G to a proof s of F to get a proof
t·s of G. The “proof checking” takes a proof t of F and returns a proof !t
of the sentence t is a proof of F. Logic of Proofs in the combinatory format
finds applications in those areas where modal logics, epistemic logics, logics of
knowledge work.

The λ-version of the whole class of proof polynomials is not normalizing.
Indeed, the important feature of proof polynomials is their polymorphism. In
particular, a variable can be assigned a finite number of arbitrary types. If a
variable x has two types A and A → A then the λ-term (λx.xx)·(λx.xx) does
not have a normal form. In order to find a natural normalizing subsystem we limit
our considerations to λ-terms for single-conclusion (functional) proof systems,
i.e. systems where each proof proves only one theorem. Such terms will have
unique types. Proof polynomials in the combinatory format for functional proof 
systems has been studied in m- 

Requirements 1-4 above in the λ-term format inevitably lead to the system
$\lambda^\infty$ below. The key idea of having nested copies of basic operations on
λ-terms in $\lambda^\infty$ can be illustrated by the following examples. By the
Internalization Property, a propositional derivation $A \to B, A \vdash B$ yields a
λ-term derivation¹

\[ x_1 : (A \to B),\ y_1 : A \vdash (x_1 \circ y_1) : B , \]

which yields the existence of a term $t(x_2, y_2)$ such that

\[ x_2 : x_1 : (A \to B),\ y_2 : y_1 : A \vdash t(x_2, y_2) : (x_1 \circ y_1) : B \]

(here and below “:” is meant to be right-associative). Naturally, we opt to accept
such t as a basic operation and call it $\circ^2$. The defining identity for
$\circ^2$ is thus

\[ x_2 : x_1 : (A \to B),\ y_2 : y_1 : A \vdash (x_2 \circ^2 y_2) : (x_1 \circ y_1) : B , \]

or, in the alternative notation $t^F$ for type assignments

\[ x_2^{\,x_1^{A \to B}},\ y_2^{\,y_1^{A}} \vdash (x_2 \circ^2 y_2)^{(x_1 \circ y_1)^{B}} . \]

Likewise, the Internalization Property yields the existence of the operation
$\circ^3$ such that

\[ x_3 : x_2 : x_1 : (A \to B),\ y_3 : y_2 : y_1 : A \vdash (x_3 \circ^3 y_3) : (x_2 \circ^2 y_2) : (x_1 \circ y_1) : B . \]

Similar nested counterparts appear for λ-abstraction, pairing, and projections.

¹ Here “∘” denotes application on λ-terms.

New series of operations are generated by the reflection and proof checking
principles. The reflection has the form $t : A \vdash A$. The internalized version of
this derivation gives a unary operation $\Downarrow$ such that

\[ x_1 : t : A \vdash {\Downarrow} x_1 : A . \]

Likewise, for some unary operation $\Downarrow^2$

\[ x_2 : x_1 : t : A \vdash {\Downarrow^2} x_2 : {\Downarrow} x_1 : A , \]

etc.

The proof checking derivation $t : A \vdash {!t} : t : A$ gives rise to a unary
operation $\Uparrow$ such that

\[ x_1 : t : A \vdash {\Uparrow} x_1 : {!t} : t : A . \]

Further application of Internalization produces $\Uparrow^2$, $\Uparrow^3$, etc. such that

\[ x_2 : x_1 : t : A \vdash {\Uparrow^2} x_2 : {\Uparrow} x_1 : {!t} : t : A , \]

\[ x_3 : x_2 : x_1 : t : A \vdash {\Uparrow^3} x_3 : {\Uparrow^2} x_2 : {\Uparrow} x_1 : {!t} : t : A , \]

etc.



Theorem 1 below demonstrates that such a set of nested operations on
λ-terms is in fact necessary and sufficient to guarantee the Internalization Property
for the whole of $\lambda^\infty$. Therefore, there is nothing arbitrary in this set, and
nothing is missing there either.

Normalization of terms depends upon a choice of reductions, which may vary
from one application to another. In this paper we consider a system of reductions
motivated by our provability reading of $\lambda^\infty$. We consider a normalization
process as a kind of search for a direct proof of a given fixed formula (type). In
particular, we try to avoid changing a formula (type) during normalization. This puts
some restriction on our system of reductions. As a result, the reflective λ-terms under
the chosen reductions are strongly normalizable, but not confluent. A given term
may have different normal forms. An additional motivation for considering those
reductions is that they extend the usual set of λ-calculus reductions and subsume
strong normalization for such components of $\lambda^\infty$ as the intuitionistic calculus
and the simply typed λ-calculus.



2 Reflective $\lambda$-Calculus

The reflective $\lambda$-calculus $\lambda^\infty$ below is a joint calculus of propositions (types)
and proofs ($\lambda$-terms) with rigid typing. Every term and all subterms of a term
carry a fixed type. In other words, in $\lambda^\infty$ we assume a Church style rigid typing
rather than a Curry style type assignment system.

2.1 Types and Typed Terms

The language of reflective $\lambda$-calculus includes

propositional letters $p_1, p_2, p_3, \ldots$

type constructors (connectives) $\to$, $\wedge$

term constructors (functional symbols): unary $!$, $\Uparrow^n$, $\Downarrow^n$, $\pi_0^n$, $\pi_1^n$; binary $\circ^n$,
$p^n$, for $n = 1, 2, 3, \ldots$

operator symbols $\lambda^1, \lambda^2, \ldots, \lambda^n, \ldots$

a countably infinite supply of variables $x_1, x_2, x_3, \ldots$ of each type $F$ (definition
below); each variable $x$ is a term of its unique pre-assigned type.

Types and (well-typed, well-defined, well-formed) terms are defined by a
simultaneous induction according to the calculus $\lambda^\infty$ below.

1. Propositional letters are (atomic) types.

2. Types (formulas) $F$ are built according to the grammar

$$F = p \mid F \to F \mid F \wedge F \mid t : F,$$

where $p$ is an atomic type, $t$ a well-formed term having type $F$. Types of format
$t : F$ where $t$ is a term and $F$ a type are called type assertions or quasi-atomic
types. Note that only correct type assertions $t : F$ are syntactically allowed inside
types. The informal semantics for $t : F$ is "$t$ is a proof of $F$"; so a formula

$$t_n : t_{n-1} : \ldots : t_1 : A$$

can be read as "$t_n$ is a proof that $t_{n-1}$ is a proof that ... is a proof that $t_1$ proves
$A$". For the sake of brevity we will refer to types as terms of depth 0.

3. Inhabited types and well-formed terms (or terms for short) are constructed
according to the calculus $\lambda^\infty$ below.

A derivation in $\lambda^\infty$ is a rooted tree with nodes labelled by types, in particular,
type assertions. Leaves of a derivation are labelled by axioms of $\lambda^\infty$, which are
arbitrary types or type assertions $x : F$ where $F$ is a type and $x$ a variable of
type $F$. Note that the set of axioms is thus also defined inductively according
to $\lambda^\infty$: as soon as we are able to establish that $F$ is a type (in particular, for a
quasi-atomic type $s : G$ this requires establishing by means of $\lambda^\infty$ that $s$ indeed
is a term of type $G$), we are entitled to use variables of type $F$ as new axioms.

A context is a collection of quasi-atomic types $x_1 : A_1, x_2 : A_2, \ldots, x_n : A_n$
where $x_i$, $x_j$ are distinct variables for $i \neq j$. A derivation tree is in a context $\Gamma$
if all leaves of the derivation are labelled by some quasi-atomic types from $\Gamma$.

A step down from leaves to the root is performed by one of the inference
rules of $\lambda^\infty$. Each rule comes in levels $n = 0, 1, 2, 3, \ldots$. A rule has one or two
premises which are types (in particular, type assertions), and a conclusion. The
intended reading of such a rule is that if premises are inhabited types, then the
conclusion is also inhabited. If the level of a rule is greater than 0, then the
premise(s) and the conclusion are all type assertions. Such a rule is regarded
also as a term formation rule with the intended reading: the conclusion $t : F$ is
a correct type assertion provided the premise(s) are correct.

If $t : F$ appears as a label in (the root of) a derivation tree, we say that $t$ is a
term of type $F$. We also refer to terms as well-defined, well-typed, well-formed
terms.

In $\lambda^\infty$ we use the natural deduction format, where derivations are represented
by proof trees with assumptions, both open (charged) and closed (discharged).
We will also use the sequent style notation for derivations in $\lambda^\infty$ by reading
$\Gamma \vdash F$ as a $\lambda^\infty$-derivation of $F$ in $\Gamma$. Within the current definition below we
assume that $n = 0, 1, 2, \ldots$ and $\vec{v} = (v_1, v_2, \ldots, v_n)$. In particular, if $n = 0$ then
$\vec{v}$ is empty. We also agree on the following vector-style notations:

$\vec{t} : A$ denotes $t_n : t_{n-1} : \ldots : t_1 : A$ (e.g. $\vec{t} : A$ is $A$, when $n = 0$),

$\vec{t} : \{A_1, A_2, \ldots, A_n\}$ denotes $\{t_1 : A_1, t_2 : A_2, \ldots, t_n : A_n\}$,

$\lambda^n \vec{x}.\vec{t} : B$ denotes $\lambda^n x_n.t_n : \lambda^{n-1} x_{n-1}.t_{n-1} : \ldots : \lambda x_1.t_1 : B$,

$(\vec{t} \circ^n \vec{s}) : B$ denotes $(t_n \circ^n s_n) : (t_{n-1} \circ^{n-1} s_{n-1}) : \ldots : (t_1 \circ s_1) : B$,

$\Uparrow^n \vec{t} : B$ denotes $\Uparrow^n t_n : \Uparrow^{n-1} t_{n-1} : \ldots : \Uparrow t_1 : B$,

likewise for all other functional symbols of $\lambda^\infty$.

Derivations are generated by the following clauses. Here $A$, $B$, $C$ are formulas,
$\Gamma$ a finite set of types, $\vec{s}$, $\vec{t}$, $\vec{u}$ are $n$-vectors of pseudo-terms, $\vec{x}$ are $n$-vectors of
variables, $n = 0, 1, 2, \ldots$

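Each rule is given as a natural deduction rule, followed by its sequent form.

(Ax) $\vec{x} : A$. Sequent form: $\Gamma \vdash \vec{x} : A$, if $\vec{x} : A$ is in $\Gamma$.

($\lambda$)

$$\frac{\vec{t} : B}{\lambda^n \vec{x}.\vec{t} : (A \to B)}$$

provided $x_n : x_{n-1} : \ldots : x_1 : A$, $x_i$ occurs free neither in $t_j$ for $i \neq j$ nor in $A \to B$.
Premises corresponding to $x_n : x_{n-1} : \ldots : x_1 : A$ (if any) are discharged. In the
full sequent form this rule is

$$\frac{\Gamma,\ x_n : x_{n-1} : \ldots : x_1 : A \vdash t_n : t_{n-1} : \ldots : t_1 : B}{\Gamma \vdash \lambda^n x_n.t_n : \lambda^{n-1} x_{n-1}.t_{n-1} : \ldots : \lambda x_1.t_1 : (A \to B)}$$

where none of $\vec{x}$ occurs free in the conclusion sequent.

All the rules below do not bind/unbind variables.

(App)

$$\frac{\vec{t} : (A \to B) \qquad \vec{s} : A}{(\vec{t} \circ^n \vec{s}) : B} \qquad\qquad \frac{\Gamma \vdash \vec{t} : (A \to B) \qquad \Gamma \vdash \vec{s} : A}{\Gamma \vdash (\vec{t} \circ^n \vec{s}) : B}$$

($p$)

$$\frac{\vec{t} : A \qquad \vec{s} : B}{p^n(\vec{t}, \vec{s}) : (A \wedge B)} \qquad\qquad \frac{\Gamma \vdash \vec{t} : A \qquad \Gamma \vdash \vec{s} : B}{\Gamma \vdash p^n(\vec{t}, \vec{s}) : (A \wedge B)}$$

($\pi$), for $i = 0, 1$

$$\frac{\vec{t} : (A_0 \wedge A_1)}{\pi_i^n \vec{t} : A_i} \qquad\qquad \frac{\Gamma \vdash \vec{t} : (A_0 \wedge A_1)}{\Gamma \vdash \pi_i^n \vec{t} : A_i}$$

($\Uparrow$)

$$\frac{\vec{t} : u : A}{\Uparrow^n \vec{t} :\ !u : u : A} \qquad\qquad \frac{\Gamma \vdash \vec{t} : u : A}{\Gamma \vdash \Uparrow^n \vec{t} :\ !u : u : A}$$

($\Downarrow$)

$$\frac{\vec{t} : u : A}{\Downarrow^n \vec{t} : A} \qquad\qquad \frac{\Gamma \vdash \vec{t} : u : A}{\Gamma \vdash \Downarrow^n \vec{t} : A}$$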



Remark 1. The intuitionistic logic for implication/conjunction and $\lambda$-calculus
are the special cases for rules with $n = 0$ and $n = 1$ only, respectively, if we
furthermore restrict all of the displayed formulas to types which do not contain
quasi-atoms.



Example 1. Here are some examples of $\lambda^\infty$-derivations in the sequent format
(cf. 3.2). We skip the trivial axiom parts for brevity.

1) From $y : x : A \vdash \Downarrow y : A$ derive
$\vdash \lambda y.\Downarrow y : (x : A \to A)$.

2) From $y : x : A \vdash \Uparrow y :\ !x : x : A$ derive
$\vdash \lambda y.\Uparrow y : (x : A \to\ !x : x : A)$.

However, one can derive neither $\vdash A \to x : A$, nor $\vdash \lambda x.!x : (A \to x : A)$, since $x$
occurs free there.

3) $u : x : A,\ v : y : B \vdash p^2(u, v) : p(x, y) : (A \wedge B)$
$u : x : A \vdash \lambda^2 v.p^2(u, v) : \lambda y.p(x, y) : (B \to (A \wedge B))$
$\vdash \lambda^2 uv.p^2(u, v) : \lambda xy.p(x, y) : (A \to (B \to (A \wedge B)))$

4) $u : x : A,\ v : y : B \vdash p^2(u, v) : p(x, y) : (A \wedge B)$
$u : x : A,\ v : y : B \vdash \Uparrow p^2(u, v) :\ !p(x, y) : p(x, y) : (A \wedge B)$
$\vdash \lambda uv.\Uparrow p^2(u, v) : (x : A \to (y : B \to\ !p(x, y) : p(x, y) : (A \wedge B)))$

Note that unlike in the previous example we cannot introduce $\lambda^2$ in place of $\lambda$
at the last stage here since the resulting sequent would be

$$\vdash \lambda^2 uv.\Uparrow p^2(u, v) : \lambda xy.!p(x, y) : (A \to (B \to p(x, y) : (A \wedge B)))$$

containing abstraction over variables $x$, $y$ which are left free in the conclusion,
which is illegal.

Here is an informal explanation of why such a derivation should not be
permitted. Substituting different terms for $x$ and $y$ in the last sequent produces
different types from $A \to (B \to p(x, y) : (A \wedge B))$, whereas neither of the terms
$\lambda xy.!p(x, y)$ and $\lambda^2 uv.\Uparrow p^2(u, v)$ changes after such substitutions. This is bad
syntactically, since the same terms will be assigned different types. Semantically
this is bad as well, since it would violate the one proof - one theorem convention.
Proposition 1. (Closure under substitution) If $t(x)$ is a well-defined term, $x$ a
variable of type $A$, $s$ a term of type $A$ free for $x$ in $t(x)$, then $t(s)$ is a well-defined
term of the same type as $t(x)$.

Proposition 2. (Uniqueness of Types) If both $t : F$ and $t : F'$ are well-typed
terms, then $F = F'$.

Theorem 1. (Internalization Property for $\lambda^\infty$) Let $\lambda^\infty$ derive

$$A_1, A_2, \ldots, A_m \vdash B.$$

Then one can build a well-defined term $t(x_1, x_2, \ldots, x_m)$ with fresh variables $\vec{x}$
such that $\lambda^\infty$ also derives

$$x_1 : A_1, x_2 : A_2, \ldots, x_m : A_m \vdash t(x_1, x_2, \ldots, x_m) : B.$$

Proof. We increment $n$ at every node of the derivation $A_1, A_2, \ldots, A_m \vdash B$. The
base case is obvious. We will check the most principal step clause ($\lambda$), leaving
the rest as an exercise. Let the last step of a derivation be

$$\frac{\Gamma,\ y_n : y_{n-1} : \ldots : y_1 : A \vdash t_n : t_{n-1} : \ldots : t_1 : B}{\Gamma \vdash \lambda^n y_n.t_n : \lambda^{n-1} y_{n-1}.t_{n-1} : \ldots : \lambda y_1.t_1 : (A \to B)}$$

By the induction hypothesis, for some term $s(\vec{x}, x_{m+1})$ of fresh variables $\vec{x}, x_{m+1}$

$$\vec{x} : \Gamma,\ x_{m+1} : y_n : y_{n-1} : \ldots : y_1 : A \vdash s(\vec{x}, x_{m+1}) : t_n : t_{n-1} : \ldots : t_1 : B.$$

Apply the rule ($\lambda$) for $n + 1$ to obtain

$$\vec{x} : \Gamma \vdash \lambda^{n+1} x_{m+1}.s(\vec{x}, x_{m+1}) : \lambda^n y_n.t_n : \lambda^{n-1} y_{n-1}.t_{n-1} : \ldots : \lambda y_1.t_1 : (A \to B),$$

and put $t(x_1, x_2, \ldots, x_m) = \lambda^{n+1} x_{m+1}.s(\vec{x}, x_{m+1})$.



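2.2 Reductions in the $\lambda^\infty$ Calculus

Definition 1. For $n = 1, 2, \ldots$, the redexes for $\lambda^\infty$ and their contracta (written
as $t \triangleright t'$, for $t$ a redex, $t'$ a contractum of $t$) are:

- $(\lambda^n x_n.t_n) \circ^n s_n : \ldots : (\lambda x_1.t_1) \circ s_1 : F \ \triangleright\ t_n[s_n/x_n] : \ldots : t_1[s_1/x_1] : F$
($\beta_n$-contraction)

- $\pi_i^n p^n(t_0^n, t_1^n) : \ldots : \pi_i p(t_0^1, t_1^1) : F \ \triangleright\ t_i^n : \ldots : t_i^1 : F$, $i = 0, 1$

- $\Downarrow^n \Uparrow^n t_n : \Downarrow^{n-1} \Uparrow^{n-1} t_{n-1} : \ldots : \Downarrow \Uparrow t_1 : F \ \triangleright\ t_n : t_{n-1} : \ldots : t_1 : F$

Before defining the reduction relation on reflective $\lambda$-terms (Definition 4) we
give some motivation (Remarks 2 and 3).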

Remark 2. The system should allow types of normalized terms to contain terms 
which are not normal. The object, then, is to normalize a term of any type, 
regardless of the redundancy of the type, and focus on eliminating redundancy 
in the term. It has been noted, for instance, that the statement of the normal- 
ization property for a term system, contains “redundancy” , because it says that 
something which is not normal, is equivalent to a normal thing. 
Remark 3. In the simply-typed and untyped $\lambda$-calculi, reduction of terms is
defined as contraction of the redexes which occur as subterm occurrences of the
term. For $\lambda^\infty$, the situation is a little trickier. We want to be sure that, when
reducing a term $t_n$ of type $t_{n-1} : \ldots : t_1 : A$, we end up with a well-typed term
$t'_n$ having type $t'_{n-1} : \ldots : t'_1 : A$. I.e., we don't want the formula $A$ to change
under reduction, while the intermediary terms $t_j$ may be allowed to change in
a predictable way.

An example illustrating the difficulties that arise when we allow any subterm
occurrence of a redex to contract is the $\lambda^\infty$-term derived as follows:

$x_3 : x_2 : x_1 : A \vdash \Uparrow^2 x_3 : \Uparrow x_2 :\ !x_1 : x_1 : A$
$x_3 : x_2 : x_1 : A \vdash \Downarrow^2 \Uparrow^2 x_3 : \Downarrow \Uparrow x_2 : x_1 : A$
$\vdash \lambda x_3.\Downarrow^2 \Uparrow^2 x_3 : (x_2 : x_1 : A \to \Downarrow \Uparrow x_2 : x_1 : A)$

Let us use a blend of type assertion notations $t^F$ and $t : F$ to show the type of
the subterm $\Downarrow^2 \Uparrow^2 x_3$ of the resulting term more explicitly.

$$[\lambda x_3.(\Downarrow^2 \Uparrow^2 x_3)^{\Downarrow \Uparrow x_2 : x_1 : A}]$$

In principle, the subterm $(\Downarrow^2 \Uparrow^2 x_3)^{\Downarrow \Uparrow x_2 : x_1 : A}$ is a redex. However, from the proof
theoretical point of view it would be a mistake to allow the whole term to reduce
to

$$[\lambda x_3.x_3]^{(x_2 : x_1 : A \to x_2 : x_1 : A)}.$$

Indeed, the formula (type) $\tau$

$$x_2 : x_1 : A \to \Downarrow \Uparrow x_2 : x_1 : A$$

is a nontrivial formalized special case of so-called subject extension. We do not
want to change this formula (type) in a normalization process to

$$x_2 : x_1 : A \to x_2 : x_1 : A$$

which is a trivial proposition. Speaking proof theoretically, while normalizing we
are looking for a normal proof of a given proposition (here it is formula $\tau$) rather
than changing a proposition itself in a process of finding a "better" proof of it.
Abandoning $\tau$ in the normalization process would not allow us to find a
natural normal inhabitant (normal proof) of $\tau$. The principal difficulty here is that
the main term is "lower level" than the subterm occurrence being contracted.
Accordingly, we develop a notion of subterm occurrences sitting properly in a
term.

Definition 2. For $t$ a well-defined $\lambda^\infty$-term, $t$ is of level $n$ (written $\ell ev(t) = n$)
if the bottom rule in the derivation tree has superscript $n$.

Definition 3. For $t_0$ a subterm occurrence of a $\lambda^\infty$-term $t$, with $\ell ev(t_0) = n$, we
say that $t_0$ is properly embedded in $t$ ($t_0 \sqsubseteq_p t$) if there are no subterm occurrences
in $t$ containing $t_0$ of the forms: $\lambda^m x.s$, $s \circ^m r$, $p^m(s, r)$, $\pi_i^m s$, with $m < n$.

Speaking vaguely, we require that redexes be properly embedded in all subterms
except, maybe, the "proof checkers". We will show below that this
requirement suffices for type preservation, and strong normalization of $\lambda^\infty$-terms.

We may now define reduction of $\lambda^\infty$-terms.

Definition 4. (a) For any well-defined $\lambda^\infty$-terms $t$ and $t'$, $t$ one-step reduces
to $t'$ ($t \succ_1 t'$) if $t$ contains a properly embedded subterm occurrence $s$ of the form
of one of the redexes above, $s \triangleright s'$, and $t' = t[s'/s]$.

(b) The reduction relation $\succ$ is the reflexive and transitive closure of $\succ_1$.

Equipped with the above-defined notion of reduction, $\lambda^\infty$ enjoys (suitably
modified versions of) the most important properties of the simply typed lambda
calculus. The elementary property of type preservation under reduction holds,
in a modified form, namely, for $t : F$ a well-typed term, if $t \succ_1 t'$ then $t' : F'$ is
well-typed, for a naturally determined type $F'$. More precisely,

Proposition 3. (Type Preservation) Let $t_n$ be a well-typed term of level $n$,
having type $t_{n-1} : \ldots : t_1 : A$. If $t_n \succ_1 t'_n$, then $t'_n$ is a well-typed level $n$ term.

Furthermore, $t'_n$ has type $t'_{n-1} : \ldots : t'_1 : A$, with $t_i \succ_1 t'_i$ or $t_i = t'_i$, for
$1 \le i \le n - 1$.

Proof. $t_n \succ_1 t'_n$, by definition, only if there exists $s_n \sqsubseteq_p t_n$, $s_n \triangleright s'_n$, and $t'_n =
t_n[s'_n/s_n]$. We proceed by induction on $|t_n|$, and on the number of subterm
occurrences between $t_n$ and $s_n$, and construct a derivation in $\lambda^\infty$ of the new
term $t'_n$. The construction of this derivation is analogous to the construction of
derivations in intuitionistic logic under detour reduction.

We get a form of subject reduction as an immediate consequence:

Corollary 1. (Subject Reduction) $\Gamma \vdash \vec{t} : A \Rightarrow \Gamma \vdash \vec{t}' : A$, for $t_n \succ_1 t'_n$.



Remark 4. It is easy to see that the derivation of $t'_n$ obtained in the proof above
is uniquely determined by $t'_n$ and the derivation of $t_n$. In this way, we associate
to each reduction of terms a corresponding reduction of derivations. Under this
operation of simultaneous term and derivation reduction, we see how the
internalization property of $\lambda^\infty$ is a generalization of the Curry-Howard isomorphism.
Namely, the square in Figure 1 commutes, where $\iota$ is the map given by the
Internalization Property, and $\succ_1$ is the map given by one-step reduction.



Example 2. We may modify the term from the above example, to obtain a term
with a properly embedded redex:

$x_3 : x_2 : x_1 : A \vdash \Uparrow^2 x_3 : \Uparrow x_2 :\ !x_1 : x_1 : A$
$x_3 : x_2 : x_1 : A \vdash \Downarrow^2 \Uparrow^2 x_3 : \Downarrow \Uparrow x_2 : x_1 : A$
$\vdash \lambda^2 x_3.\Downarrow^2 \Uparrow^2 x_3 : \lambda x_2.\Downarrow \Uparrow x_2 : (x_1 : A \to x_1 : A)$
[Figure: commutative square relating $\vec{t} : A$, $t_n : \vec{t} : A$ and their one-step reducts.]

Fig. 1. Generalization of the Curry-Howard isomorphism

The (properly embedded) redex $\Downarrow^2 \Uparrow^2 x_3$ contracts to $x_3$, and the term derivation
and type are modified accordingly:

$x_3 : x_2 : x_1 : A \vdash x_3 : x_2 : x_1 : A$
$\vdash \lambda^2 x_3.x_3 : \lambda x_2.x_2 : (x_1 : A \to x_1 : A)$

The result really is a normal proof of the statement inside the parentheses.

Proposition 4. The reduction relation is not confluent.

Proof. By a counterexample. Consider the following well-defined terms (missing
types can be easily recovered).

$p^2(x_2, y_2) : p(x_1, y_1) : (A \wedge B)$

$\lambda x_2.p^2(x_2, y_2) : (x_1 : A \to p(x_1, y_1) : (A \wedge B))$

$(\lambda x_2.p^2(x_2, y_2)) \circ z : p(x_1, y_1) : (A \wedge B)$

$\lambda^2 y_2.[(\lambda x_2.p^2(x_2, y_2)) \circ z] : \lambda y_1.p(x_1, y_1) : (B \to (A \wedge B))$

If we call the above term $t_2 : t_1 : (B \to (A \wedge B))$,

then $t_2 \circ^2 \pi_0^2 p^2(u_2, w_2) : t_1 \circ \pi_0 p(u_1, w_1) : (A \wedge B)$ reduces to both:

(1) $t_2[\pi_0^2 p^2(u_2, w_2)/y_2] : t_1[\pi_0 p(u_1, w_1)/y_1] : (A \wedge B)$ and

(2) $t_2 \circ^2 u_2 : t_1 \circ u_1 : (A \wedge B)$,

which further reduces to

(3) $t_2[u_2/y_2] : t_1[u_1/y_1] : (A \wedge B)$.

And there is no way to reduce (1) to (3), since $\pi_0^2 p^2(u_2, w_2)$ isn't properly
embedded in $t_2[\pi_0^2 p^2(u_2, w_2)/y_2]$.
3 Strong Normalization of $\lambda^\infty$-Terms

Definition 5. (a) A reduction sequence is any sequence of terms $t_0, \ldots, t_n, \ldots$
such that, for $0 \le j < n$, $t_j \succ_1 t_{j+1}$. (b) A term $t$ is strongly normalizing
($t \in SN$) if every possible reduction sequence for $t$ is finite. (c) If $t$ is strongly
normalizing, $\nu(t)$ = the least $n$ such that every reduction sequence for $t$
terminates in fewer than $n$ steps.
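
The notions in Definition 5 can be illustrated on a small abstract reduction graph. The sketch below is only an illustration under stated assumptions: the graph `STEP` and its node names are hypothetical and do not encode actual $\lambda^\infty$-terms, and `longest_reduction` computes the length of the longest reduction sequence, from which a bound in the sense of $\nu(t)$ can be read off. The example graph is strongly normalizing but not confluent, the same combination that Proposition 4 and Theorem 2 establish for $\lambda^\infty$.

```python
# Hypothetical one-step reduction relation on abstract "terms".
# STEP[t] lists every t' with t >_1 t'. The graph is acyclic, so every
# reduction sequence is finite (each node is strongly normalizing).
STEP = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": [],
    "d": [],
}


def longest_reduction(t):
    """Number of steps in the longest reduction sequence starting at t."""
    successors = STEP[t]
    if not successors:
        return 0
    return 1 + max(longest_reduction(s) for s in successors)


def normal_forms(t):
    """Set of normal forms reachable from t."""
    successors = STEP[t]
    if not successors:
        return {t}
    result = set()
    for s in successors:
        result |= normal_forms(s)
    return result


if __name__ == "__main__":
    print(longest_reduction("a"))
    print(sorted(normal_forms("a")))
```

Here the node `a` has two distinct normal forms, so the relation is not confluent even though every reduction sequence terminates. The recursion terminates only because the graph has no cycles.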

We now show that all terms of A°° are strongly normalizing. The proof 
follows the method of Tait (cf . fZj ) • This method defines a class of reducible terms 
by induction on type. We are concerned with types, but will ignore variations in 
type, up to a point. 

Definition 6. More precisely, a term $t$ is $n$-cushioned of type $A$ if $t$ has type
$t_{n-1} : \ldots : t_1 : A$, where the $t_i$ are $\lambda^\infty$-terms, for $1 \le i \le n - 1$. Note that
this is a weaker requirement than $\ell ev(t) = n$. Indeed, $\ell ev(t) = n$ implies $t$ is
$k$-cushioned, for $k \le n$, and it could also be that $t$ is $m$-cushioned, for some
$m > n$.

For the proof, we'll define sets $RED^n_T$ of reducible terms for $n$-cushioned terms
of type $T$, by induction on $T$. We then prove some properties of these sets of
terms, including the property that all terms in the sets are strongly normalizing.
Finally, strong normalization of the calculus is established by showing that all
$\lambda^\infty$-terms are in a set $RED^n_T$.

Definition 7. If $T$ is a type, we define the depth of $T$, $|T|$, inductively:

$|p| = 0$, for $p$ an atomic proposition
$|A \wedge B| = |A \to B| = \max(|A|, |B|) + 1$
$|u : A| = |A| + 1$
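
The depth measure of Definition 7 can be computed by a direct structural recursion. The following is a minimal sketch, assuming a simplified representation of types in which the term inside a type assertion is kept as an opaque string; the class names `Atom`, `Imp`, `And` and `Assert` are illustrative choices, not notation from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str  # propositional letter


@dataclass(frozen=True)
class Imp:
    left: "Type"
    right: "Type"  # A -> B


@dataclass(frozen=True)
class And:
    left: "Type"
    right: "Type"  # A /\ B


@dataclass(frozen=True)
class Assert:
    term: str  # the term u, kept opaque in this sketch
    body: "Type"  # u : A


Type = Union[Atom, Imp, And, Assert]


def depth(t: Type) -> int:
    """Depth |T| as in Definition 7."""
    if isinstance(t, Atom):
        return 0
    if isinstance(t, (Imp, And)):
        return max(depth(t.left), depth(t.right)) + 1
    if isinstance(t, Assert):
        return depth(t.body) + 1
    raise TypeError(f"not a type: {t!r}")


if __name__ == "__main__":
    a = Atom("A")
    # |x : A -> A| = max(|x : A|, |A|) + 1 = max(1, 0) + 1
    print(depth(Imp(Assert("x", a), a)))
```

Since every clause strictly decreases the size of the type, induction on $|T|$, as used in Definition 8, is well founded.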

Definition 8. The sets $RED^n_T$ of reducible $n$-cushioned terms of type $T$ are
defined by induction on $|T|$:

1. For $t$ of atomic type $p$, $t \in RED^n_p$ if $t$ is strongly normalizing.

2. For $t$ of type $A \to B$, $t \in RED^n_{A \to B}$ if, for all $s \in RED^n_A$, $t \circ^n s \in RED^n_B$.

3. For $t$ of type $A_0 \wedge A_1$, $t \in RED^n_{A_0 \wedge A_1}$ if $\pi_i^n t \in RED^n_{A_i}$, for $i = 0, 1$.

4. For $t$ of type $u : A$, $t \in RED^n_{u:A}$ if $\Downarrow^n t \in RED^n_A$ and $t \in RED^{n+1}_A$.

Proposition 5. For $t$ a well-typed term, if $t$ is strongly normalizing, then $\Downarrow^n t$
is too.

Proof. If $t$ is not of the form $\Uparrow^n t_0$, then every reduction sequence for $\Downarrow^n t$ has
at most $\nu(t)$ terms in it. If $t$ is of the form $\Uparrow^n t_0$, then every reduction
sequence for $\Downarrow^n t$ has length $\le \nu(t) + 1$.

Definition 9. A term $t$ is neutral if $t$ is not of the form $\lambda^n x.t_0$, $p^n(t_0, t_1)$, or
$\Uparrow^n t_0$.
Proposition 6. For $T$ any type, all terms $t \in RED^n_T$ have the following
properties:

(CR 1) If $t \in RED^n_T$, then $t$ is strongly normalizing.

(CR 2) If $t \in RED^n_T$ and $t \succ t'$, then $t' \in RED^n_T$, for
$t : u_{n-1} : u_{n-2} : \ldots : u_1 : T \ \succ\ t' : u'_{n-1} : u'_{n-2} : \ldots : u'_1 : T$.
(CR 3) If $t$ is neutral, and $t' \in RED^n_T$ for all $t'$ such that $t \succ_1 t'$, then $t \in RED^n_T$.
(CR 4) If $t \in RED^n_T$, then $t \in RED^{n-1}_{u_1 : T}$, for $t : u_{n-1} : u_{n-2} : \ldots : u_1 : T$.

Proof. By induction on $|T|$.

Step 1 $T$ is atomic, i.e. $T = p$.

(CR 1) Trivial.

(CR 2) If $t$ is SN, and $t \succ t'$, then $t'$ is SN, so $t'$ is reducible.

(CR 3) Let $M = \max(\nu(t') \mid t \succ_1 t')$. There are finitely many $t'$ such that
$t \succ_1 t'$, so $M$ is finite. And $\nu(t) \le M + 1$.

(CR 4) $t \in RED^n_p$ yields $t$ is SN, thus $\Downarrow^{n-1} t$ is SN, by the above proposition.
Therefore, $\Downarrow^{n-1} t \in RED^{n-1}_p$ and $t \in RED^{n-1}_{u_1 : p}$.
Step 2 $T = (A \to B)$.

(CR 1) Let $x$ be a variable of type $A$. Then $t \circ^n x \in RED^n_B$, and by induction,
$t \circ^n x$ is SN. Since every reduction sequence for $t$ is embedded in a
reduction sequence of $t \circ^n x$, $t$ is SN. This argument works since $t \succ t'$ implies
that the subterm occurrence being contracted is properly embedded in
$t$, which means this subterm occurrence must have $\ell ev < n + 1$. Under
those conditions $t \succ t'$ implies $t \circ^n x \succ t' \circ^n x$.

(CR 2) Let $t \succ t'$ for $t : u_{n-1} : u_{n-2} : \ldots : u_1 : T$, and take $s \in RED^n_A$ for
$s : w_{n-1} : w_{n-2} : \ldots : w_1 : A$. Then $t \circ^n s \in RED^n_B$, and $t \circ^n s \succ t' \circ^n s$. By
induction, $t' \circ^n s$ is reducible. Since $s$ was arbitrary, $t'$ is reducible.

(CR 3) Let $t$ be neutral, and suppose that $t'$ is reducible, for all $t'$ such that
$t \succ_1 t'$. Let $s \in RED^n_A$. By induction, $s$ is SN. We prove, by induction
on $\nu(s)$, that if $t \circ^n s \succ_1 (t \circ^n s)'$, then $(t \circ^n s)'$ is reducible. There are two
possibilities for $(t \circ^n s)'$:

1. $(t \circ^n s)'$ is $t' \circ^n s$, with $t \succ_1 t'$. Then $t'$ is reducible, so $t' \circ^n s$ is.

2. $(t \circ^n s)'$ is $t \circ^n s'$, with $s \succ_1 s'$. Then $s'$ is reducible, so $t \circ^n s'$ is.

By the induction hypothesis, $t \circ^n s$ is reducible, because it's neutral.
Thus, $t$ is reducible.

(CR 4) We need to show that $\Downarrow^{n-1} t \in RED^{n-1}_{A \to B}$. Let $r \in RED^{n-1}_A$. By
induction, $r$ is SN; and by (CR 1), $t$ is SN. Thus, by the above
proposition, $\Downarrow^{n-1} t$ is SN, and we can induct on $\nu(r) + \nu(\Downarrow^{n-1} t)$. Base:
$\nu(r) + \nu(\Downarrow^{n-1} t) = 0$. Then $\Downarrow^{n-1} t \circ^{n-1} r$ is normal (and it is neutral), so
by induction, $\Downarrow^{n-1} t \circ^{n-1} r \in RED^{n-1}_B$. Induction: in one step, $\Downarrow^{n-1} t \circ^{n-1} r$
reduces to

• $(\Downarrow^{n-1} t)' \circ^{n-1} r$ - reducible by induction

• $\Downarrow^{n-1} t \circ^{n-1} r'$ - reducible by induction.

Hence, by induction hypothesis (CR 3), $\Downarrow^{n-1} t \circ^{n-1} r \in RED^{n-1}_B$ for all
$r \in RED^{n-1}_A$. Thus, $\Downarrow^{n-1} t \in RED^{n-1}_{A \to B}$, and $t \in RED^{n-1}_{u_1 : (A \to B)}$.

Step 3 $T = (A_0 \wedge A_1)$.

(CR 1)-(CR 3) Similar to the previous step. Note that $t \succ t'$ yields $\pi_i^n t \succ
\pi_i^n t'$, for $t : u_{n-1} : u_{n-2} : \ldots : u_1 : (A_0 \wedge A_1)$.

(CR 4) By (CR 1), $t$ is SN. We need to show that $\Downarrow^{n-1} t \in RED^{n-1}_{A_0 \wedge A_1}$, i.e.
that $\pi_i^{n-1}(\Downarrow^{n-1} t) \in RED^{n-1}_{A_i}$, for $i = 0, 1$. By induction on $\nu(\Downarrow^{n-1} t)$. Base:
$\nu(\Downarrow^{n-1} t) = 0$. Then $\pi_i^{n-1}(\Downarrow^{n-1} t)$ is both neutral and normal. By I.H.
(CR 3), $\pi_i^{n-1}(\Downarrow^{n-1} t) \in RED^{n-1}_{A_i}$. Induction: in one step $\pi_i^{n-1}(\Downarrow^{n-1} t)$
reduces to $\pi_i^{n-1}((\Downarrow^{n-1} t)')$, which is reducible, by I.H.

Step 4 $T = (u : A)$.

(CR 1) $t \in RED^n_{u:A}$ yields $t \in RED^{n+1}_A$, thus, by I.H., $t$ is SN.

(CR 2) Let $t \succ t'$. By I.H. (CR 2), $t' \in RED^{n+1}_A$. By I.H. (CR 4), $t' \in
RED^n_{u':A}$.

(CR 3) Suppose $t' \in RED^n_{u':A}$, for all $t'$ such that $t \succ_1 t'$. Then $t' \in RED^{n+1}_A$
for all $t'$ such that $t \succ_1 t'$. By I.H. (CR 3), $t \in RED^{n+1}_A$. By I.H. (CR
4), $t \in RED^n_{u:A}$.

(CR 4) Let $t \in RED^n_{u:A}$. By (CR 1), $t$ is SN. Thus, by Proposition 5, $\Downarrow^{n-1} t$
is SN. We need to show that $\Downarrow^{n-1} t \in RED^{n-1}_{u:A}$, i.e. that $\Downarrow^{n-1} t \in RED^n_A$ and
$\Downarrow^{n-1} \Downarrow^{n-1} t \in RED^{n-1}_A$. Both are easily shown by induction on $\nu(\Downarrow^{n-1} t)$
and on $\nu(\Downarrow^{n-1} \Downarrow^{n-1} t)$, respectively, by using (CR 3) inductively.

Proposition 7. 1. If, for all $u \in RED^n_A$, $t[u/x] \in RED^n_B$, then $\lambda^n x.t \in RED^n_{A \to B}$.

2. If $t_0$, $t_1$ are reducible, then $p^n(t_0, t_1)$ is reducible.

3. If $t \in RED^n_{u:A}$, then $\Uparrow^n t \in RED^n_{!u:u:A}$.

4. If $t \in RED^n_A$ and $m < n$, then $\Uparrow^m t \in RED^{n+1}_A$.

Proof. For part (1): Let $u \in RED^n_A$. We'll show that $(\lambda^n x.t) \circ^n u$ is reducible, by
induction on $\nu(u) + \nu(t)$. $(\lambda^n x.t) \circ^n u \succ_1$:

- $t[u/x]$, which is reducible by hypothesis.

- $(\lambda^n x.t') \circ^n u$ with $t \succ_1 t'$. By induction, this is reducible.

- $(\lambda^n x.t) \circ^n u'$ with $u \succ_1 u'$. By induction, this is reducible.

Parts (2) and (3) are established similarly. Part (4) is by induction on $|A|$. The
case where $A$ is atomic is handled in the obvious way; implications and conjunctions
are handled by induction on $\nu(t) + \nu(r)$ and $\nu(t)$ respectively. Finally, the case
where $A$ is $u : B$ follows from the induction hypothesis.

Proposition 8. Let $t$ be a term. Suppose the free variables of $t$ are among
$x_1, \ldots, x_k$, of levels $n_1, \ldots, n_k$ and having types $U_1, \ldots, U_k$. Let $u_1, \ldots, u_k$ be
terms with $u_1 \in RED^{n_1}_{U_1}, \ldots, u_k \in RED^{n_k}_{U_k}$. Then the term $t[u_1/x_1, \ldots, u_k/x_k]$
(written $t[\vec{u}/\vec{x}]$) is reducible.

Proof. Step 1 $t$ entered by (Ax). Trivial.

Step 2 Let $t$ be $\lambda^n y_n.t^0_n : \lambda^{n-1} y_{n-1}.t^0_{n-1} : \ldots : \lambda y_1.t^0_1 : (A \to B)$. By I.H.,
$t^0_n[v_n/y_n, \vec{u}/\vec{x}]$ is reducible (i.e., is in some reducibility class) for all $v_n \in RED^m_T$,
where $m$ is the level of $y_n$, and $T$ is $y_n$'s type (so $y_n$ has type $r_{m-1} : \ldots : r_1 : T$,
and $m \ge n$). By (CR 4), and the definition of the classes $RED^k_{u:A}$, this implies
that $t^0_n[v_n/y_n, \vec{u}/\vec{x}]$ is reducible for all $v_n \in RED^n_A$. Now, using (CR 4) and the
definition again, $t^0_n[v_n/y_n, \vec{u}/\vec{x}]$ is reducible implies that $t^0_n[v_n/y_n, \vec{u}/\vec{x}] \in RED^n_B$.
By Proposition 7, $\lambda^n y_n.t^0_n[\vec{u}/\vec{x}]$ $(= t[\vec{u}/\vec{x}])$ is reducible.

Step 3 The other six cases are handled similarly, using either the definition of
the reducibility classes or Proposition 7.

As a corollary, we have:

Theorem 2. All well-defined terms of $\lambda^\infty$ are strongly normalizable.

4 Conclusion 

1. Proof polynomials from the Logic of Proofs (Q) represent the reflective idea 
in the combinatory terms format, with many postulated constant terms and 
three basic operations only. Such an approach turned out to be fruitful in proof 
theory, modal and epistemic logics, logics of knowledge, etc. Proof polynomials 
are capable of extracting explicit witnesses from any modal derivation and thus 
may be regarded as a logical basis for the quantitative theory of knowledge. On 
the other hand, such areas as typed programming languages, theorem provers,
verification systems, etc., use the language of $\lambda$-terms rather than combinatory
terms. Technically speaking, in the $\lambda$ format there are no postulated constants,
but there are many (in our case infinitely many) operations on terms. Those
operations create certain redundancies in terms, and the theory of eliminating
those redundancies (i.e. the normalization process) plays an important role.
Reflective $\lambda$-calculus $\lambda^\infty$ is a universal superstructure over formal languages based
on types and $\lambda$-terms. It offers a much richer collection of types, in particular the
ones capable of encoding computations in types. We wish to think that a wide
range of systems based on types and terms could make use of such a capability.

2. The very idea of reflective terms is native to both provability and
computability. The extension of the Curry-Howard isomorphism offered by the
reflective $\lambda$-calculus $\lambda^\infty$ captures reflexivity in a uniform abstract way. This opens
a possibility of finding more mutual connections between proofs and programs.
In this paper we have built strongly normalizing reflective $\lambda$-terms that
correspond to some proper subclass of proof polynomials. It is pivotal to study exact
relations between $\lambda^\infty$, Intuitionistic Modal Logic and the Intuitionistic Logic
of Proofs. Such an investigation could bring important new rules to $\lambda^\infty$, and
eventually to programming languages.

3. Reflective $\lambda$-calculus is capable of internalizing its own reductions as well
as its own derivations. If a term $t' : F'$ is obtained from $t : F$ by a reduction, then
the corresponding derivation (well-formed term) $s : t : F$ can be transformed to
a term $s' : t' : F'$. This takes care of the cases when, for example, an abstraction
term is obtained not by the abstraction rule, but by a projection reduction. Such
a step can be internalized in $\lambda^\infty$ as well.

4. A more concise equivalent axiomatization of $\lambda^\infty$ could probably be
obtained by formalizing Internalization as an atomic rule of inference rather than
a derived rule. This route is worth pursuing.

5. There are many natural systems of reductions for reflective $\lambda$-terms. For
this paper we have picked one with well-embedded redexes on the basis of the
proof theoretical semantics. We conjecture that the system with unlimited
reductions enjoys both strong normalization and confluence.
Abstract. Every instance of the schema of identity is derivable from
identity axioms. However, from a proof theoretic perspective the schema 
of identity is quite powerful. Even one instance of it can be used to prove 
formulas uniformly where no fixed number of instances of identity axioms 
is sufficient for this purpose. 



1 Introduction 

In this paper we investigate the impact of replacing instances of the schema of
identity by instances of identity axioms in valid Herbrand disjunctions. More
precisely, we consider valid sequents of the form

$$E,\ s = t \supset (B(s) \supset B(t)) \vdash \exists \bar{x}\, A(r_p, \bar{x})$$

where $E$ consists of identity axioms, $A$ is quantifier free, and $r_p$ is some
"parameter term", for which the index $p$ can be thought of as its size. Obviously
$s = t \supset (B(s) \supset B(t))$ can be derived from identity axioms and thus be replaced
(in the sequent above) by appropriate instances of identity axioms. We compare
the number of formulas in the corresponding Herbrand disjunctions

$$E',\ s = t \supset (B(s) \supset B(t)) \vdash A(r_p, \bar{t}_1), \ldots, A(r_p, \bar{t}_n)$$

and

$$E',\ \Pi \vdash A(r_p, \bar{t}'_1), \ldots, A(r_p, \bar{t}'_{n'})$$

respectively, where $E'$, $\Pi$ consist of instances of identity axioms. On the one
hand we provide examples where $|\Pi|$ (i.e., the number of formulas in $\Pi$) cannot
be uniformly bounded (i.e., depends on the size of the parameter term $r_p$). On
the other hand we show that if the parameter term is composed of a "fanning
out" function symbol then such bounds exist. (We call a unary function symbol
$f$ fanning out if $f^m(x) \neq f^n(x)$ is provable whenever $m \neq n$, for all $m, n \in \mathbb{N}$.)

The length of a Herbrand disjunction is uniformly bounded by the length
of a proof and the logical complexity of the end sequent. (This follows, e.g.,
from Gentzen’s Midsequent Theorem; see, e.g., [Takeuti, G., 1980].) Therefore
our results can be interpreted as statements on the short provability of formulas
involving complex terms using the schema of identity, i.e. its proof theoretic
strength in the literal sense. In general, the short (uniform) provability is lost
when the schema of identity is replaced by identity axioms.

2 Basic Notions 

Syntax 

Terms are built up, as usual, from constant and function symbols. In
particular, we will use $0$ and $c$ as constant symbols, $s$ and $f$ as unary function
symbols, and $+$ and $\circ$ as binary function symbols for which we use infix notation.
Atoms (atomic formulas) consist of an $n$-ary predicate symbol ($n \ge 1$) applied
to $n$ terms (arguments). Throughout the paper we refer to the binary
predicate symbol $=$, for which we use infix notation, as "identity". Formulas are
either atoms or built up from atoms using the usual connectives ($\neg$, $\wedge$, $\vee$, $\supset$) and
quantifiers ($\forall$, $\exists$). $\bigwedge_{i=1}^{n} A_i$ abbreviates $A_1 \wedge \ldots \wedge A_n$. $\neg s = t$ is written as $s \neq t$.
Tuples of variables or terms are indicated by over-lines: e.g., $\exists \bar{x} A(\bar{x})$ abbreviates
$\exists x_1 \ldots \exists x_n A(x_1, \ldots, x_n)$ for some $n \ge 1$. By $\|A\|$ we denote the number of atoms
in a formula $A$.

By an expression we mean either a term or a formula. It is convenient to
view an expression as a rooted and ordered tree, where the inner nodes are
function symbols (the root possibly is a predicate symbol) and the leaf nodes
are constants or variables. A position $p$ in an expression $A$ is a sequence of
edges in a path leading from the root to a node $n_p$ of $A$. Two positions are
overlapping if one is a proper subsequence of the other. In writing $A(t_1, \ldots, t_n)$
we implicitly refer, for each $i \in \{1, \ldots, n\}$, to some (possibly empty) set $\pi_i$
of positions in the expression $A$ in which $t_i$ occurs, where no two positions in
$\bigcup_{1 \le i \le n} \pi_i$ are overlapping. The indicated positions will be made explicit only
when necessary. In reference to $A(t_1, \ldots, t_n)$ we denote the result of replacing
$t_i$ at the corresponding positions $\pi_i$ with the term $s_i$ by $A(s_1, \ldots, s_n)$. (Note
that not necessarily all occurrences of $t_i$ in $A$ are indicated.) $g[s]$ indicates that
the term $s$ occurs as a proper subterm, at certain positions $p_1, \ldots, p_n$, $n \ge 1$,
in $g$. In reference to $g[s]$ we denote by $g[t]$ the result of replacing $s$ at these
positions with the term $t$.

"$\equiv$" denotes syntactical identity between expressions. For any term $t$ and
unary function symbol $f$ we define: $f^0(t) \equiv t$ and $f^{n+1}(t) \equiv f^n(f(t))$. Similarly
if $\circ$ is a binary function symbol (for which we use infix notation) we define:
$[t]^1_\circ \equiv t$ and $[t]^{n+1}_\circ \equiv (t \circ [t]^n_\circ)$.
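
The two iteration schemes above can be sketched as string-building functions. This is only an illustration: terms are represented as plain strings, the function names `iterate_unary` and `iterate_binary` are hypothetical, and the reading of the binary scheme (base case at $n = 1$, nesting to the right) is an assumption, since that formula is damaged in the source.

```python
def iterate_unary(f: str, n: int, t: str) -> str:
    """Build f^n(t), following f^0(t) = t and f^(n+1)(t) = f^n(f(t))."""
    if n < 0:
        raise ValueError("n must be non-negative")
    for _ in range(n):
        t = f"{f}({t})"
    return t


def iterate_binary(op: str, n: int, t: str) -> str:
    """Build the n-fold iterate of t under the infix symbol op.

    Assumed reading: the 1-fold iterate is t itself, and the
    (n+1)-fold iterate is (t op <n-fold iterate>).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    result = t
    for _ in range(n - 1):
        result = f"({t} {op} {result})"
    return result


if __name__ == "__main__":
    print(iterate_unary("s", 3, "0"))
    print(iterate_binary("+", 3, "x"))
```

Terms such as `s(s(s(0)))` are the kind of parameter terms whose size is tracked by the index $p$ in $r_p$.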

A variable is called fresh with respect to some context (e.g., a sequent) if it 
does not occur elsewhere in that context. By a fresh copy F' of a formula F we 
mean a variable renaming copy of F , such that the variables occurring in F' do 
not occur in the context. 

(Variable) substitutions are functions from variables to terms that are homomorphically
extended to expressions and sequences of formulas as usual. The
result of applying the substitution $\sigma$ to an expression $E$ is written as $\sigma(E)$.
We assume familiarity with the concept of a most general (simultaneous) unifier
of a finite set $\{(E_1, F_1), \ldots, (E_n, F_n)\}$ of pairs of expressions (see, e.g.,
[Leitsch, A., 1997]).
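For readers who want a concrete picture of simultaneous unification, the following is a minimal sketch of a textbook unification procedure with occurs check. It is not taken from the cited reference; the encoding (variables as Python strings, compound terms and constants as tuples) and the function names are assumptions made for illustration.

```python
# Variables are Python strings; constants and compound expressions are tuples
# (symbol, arg_1, ..., arg_k). A substitution is a dict from variables to terms.

def apply(sub, e):
    """Homomorphically apply a substitution to an expression."""
    if isinstance(e, str):
        return apply(sub, sub[e]) if e in sub else e
    return (e[0],) + tuple(apply(sub, a) for a in e[1:])

def occurs(v, e, sub):
    """Check whether variable v occurs in e after applying sub."""
    e = apply(sub, e)
    if isinstance(e, str):
        return e == v
    return any(occurs(v, a, sub) for a in e[1:])

def unify(pairs):
    """Return a most general simultaneous unifier of the pairs, or None."""
    sub = {}
    stack = list(pairs)
    while stack:
        a, b = stack.pop()
        a, b = apply(sub, a), apply(sub, b)
        if a == b:
            continue
        if isinstance(a, str):
            if occurs(a, b, sub):
                return None
            sub[a] = b
        elif isinstance(b, str):
            if occurs(b, a, sub):
                return None
            sub[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None
    return sub

ZERO = ("0",)
```

For example, unifying the atoms `("=", "X", "Y")` and `("=", ZERO, ("s", "X"))` binds `X` to `0` and `Y` to `s(0)` once the substitution is applied.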



Axiomatizing Identity 

The context of our investigation is pure, classical first-order logic. However we are 
mainly interested in the effects of different treatments of the identity predicate. 
We will frequently refer to the following list of axioms, called identity axioms : 

- Reflexivity: $x = x$

- Symmetry: $x = y \supset y = x$

- Transitivity: $(x = y \wedge y = z) \supset x = z$

- Congruence (functions): for every $n$-ary function symbol $f$ and every
$1 \leq i \leq n$: $x_i = y_i \supset f(x_1, \ldots, x_n) = f(x_1, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_n)$

- Congruence (predicates): for every $n$-ary predicate symbol $P$ and every
$1 \leq i \leq n$: $x_i = y_i \supset (P(x_1, \ldots, x_n) \supset P(x_1, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_n))$
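The congruence axioms form one axiom per argument position, so an $n$-ary symbol contributes exactly $n$ of them. A small sketch that enumerates them as strings may make this concrete; the function name, the ASCII arrow `->` for $\supset$ and the variable naming are assumptions made here.

```python
def congruence_axioms(symbol, arity, is_predicate=False):
    """Return the congruence axioms for a symbol, one per argument position."""
    xs = [f"x{j}" for j in range(1, arity + 1)]
    axioms = []
    for i in range(arity):
        ys = xs.copy()
        ys[i] = f"y{i + 1}"  # replace only the i-th argument
        left = f"{symbol}({', '.join(xs)})"
        right = f"{symbol}({', '.join(ys)})"
        # predicates use an implication between atoms, functions an identity
        body = f"({left} -> {right})" if is_predicate else f"{left} = {right}"
        axioms.append(f"x{i + 1} = y{i + 1} -> {body}")
    return axioms
```

Calling `congruence_axioms("+", 2)` yields the two axioms for a binary function symbol written in prefix form.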



The Classical Sequent Calculus LK 

As underlying deduction system we will use Gentzen’s (in every sense of the 
word) classical LK. This, however, is only a matter of taste: our results 
can be easily translated to other formats of deduction. We will not have to 
present any formal LK-derivations in detail here and thus simply refer to, e.g., 
[Takeuti, G., 1980] for formal definitions of sequents, rules, derivations etc. Sequents
are written as $\Gamma \vdash \Delta$, where $\Gamma$ and $\Delta$ are sequences of formulas. By $|\Gamma|$
we denote the number of occurrences of formulas in $\Gamma$.

All tautological, i.e. , propositionally valid sequents are (cut-free) derivable in 
a number of steps depending only on the logical complexity of the formulas in 
the end sequent. 

Gentzen’s Midsequent Theorem provides a version of Herbrand’s Theorem 
for sequents (see [Takeuti, G., 1980]).

Note that the sequents in an LK-derivation can be skolemized without increase
of proof length and that Herbrand disjunctions can be embedded in “re-skolemized”
derivations whose length does not depend on the term structure
(cf. [Baaz, M. and Leitsch, A., 1994], [Baaz, M. and Leitsch, A., 1999]). Consequently
the results below can be re-interpreted for quantified formulas.



3 One Application of the Schema of Identity 

The schema of identity is given by

$$x = y \supset (A(x) \supset A(y)) \qquad \text{(ID)}$$

where $A$ is a formula and $x, y$ are variables. The schema

$$x = y \supset t(x) = t(y) \qquad (\text{ID}^{=})$$

can be seen as a special case of ID (where $A(z) \equiv t(x) = t(z)$) combined with an
appropriate instance of the axiom of reflexivity. More generally, we show that a
single instance of ID is equivalent to some finite number of instances of ID$^{=}$ that
involve the same pair of terms. (By an instance of a schema we mean a formula
of the given form, where the variables have been replaced by terms.)

Proposition 1.

(a) For every conjunction $\bigwedge_{i=1}^{n} (s = t \supset r_i(s) = r_i(t))$ of instances of ID$^{=}$ there
is a tautological sequent

$$\Gamma,\ s = t \supset (A(s) \supset A(t)) \vdash \bigwedge_{i=1}^{n} (s = t \supset r_i(s) = r_i(t))$$

where $\Gamma$ consists of at most $n$ instances of the reflexivity axiom and $\|A\| \leq 2n$.

(b) For every instance $s = t \supset (A(s) \supset A(t))$ of ID there is a tautological sequent

$$\Gamma,\ s = t \supset r_1(s) = r_1(t), \ldots, s = t \supset r_n(s) = r_n(t) \vdash s = t \supset (A(s) \supset A(t))$$

where $\Gamma$ consists of instances of congruence axioms for predicate symbols and
$|\Gamma| \leq a\|A\|$, where $a$ is the maximal arity of predicate symbols occurring in $A$.

Proof. (a) Let

$$A(t) \equiv \bigwedge_{i=1}^{n} (s = t \supset r_i(s) = r_i(t))$$

and

$$A(s) \equiv \bigwedge_{i=1}^{n} (s = s \supset r_i(s) = r_i(s)).$$

Let the instances $\Gamma$ of the reflexivity axiom be $r_1(s) = r_1(s), \ldots, r_n(s) = r_n(s)$.
Then one can easily derive the sequent $\Gamma, s = t \supset (A(s) \supset A(t)) \vdash A(t)$, q.e.d.

(b) Let $r_1(s), \ldots, r_n(s)$ be all terms that occur as arguments of predicates
(atoms) in $A(s)$. Using appropriate instances of the congruence axioms we obtain
a proof of

$$\Gamma \vdash \underbrace{r_1(s) = r_1(t) \supset (\ldots \supset (r_n(s) = r_n(t) \supset (A(s) \supset A(t))) \ldots)}_{R}$$

where $|\Gamma|$ is bounded by the number of argument positions in $A$, i.e., $|\Gamma| \leq a\|A\|$,
where $a$ is the maximal arity of predicates occurring in $A$. Note that

$$s = t \supset r_1(s) = r_1(t), \ldots, s = t \supset r_n(s) = r_n(t),\ R \vdash s = t \supset (A(s) \supset A(t))$$

is tautological. Using the cut rule we thus obtain a proof of the sequent, as
required. ■
The following theorem (together with Proposition 2) shows that a single application
of the schema of identity can replace an unbounded number of instances
of identity axioms. (Remember that, by Proposition 1, the two instances of ID$^{=}$
correspond to a single instance of ID in presence of some additional instances of
congruence axioms.)

Theorem 1. Let $A(u, x, y) \equiv ((x + y = y \supset x = 0) \wedge 0 + 0 = 0) \supset u = 0$. (We
refer to all positions at which the variables $u$, $x$, $y$ occur.) For all $p \in \mathbb{N}$ there are
terms $r_1^p, r_2^p, t_1^p, t_2^p$ such that the sequent

$$0 + 0 = 0 \supset r_1^p(0 + 0) = r_1^p(0),\ 0 + 0 = 0 \supset r_2^p(0 + 0) = r_2^p(0),\ T \vdash A([0]^p_+, t_1^p, t_2^p)$$

is tautological, where $T$ is an instance of the transitivity axiom.

Proof. Let

$$r_1^p(0 + 0) \equiv [0]^p_+ + ([0]^{p-1}_+ + \ldots + ([0]^1_+ + 0) \ldots)$$

where “$(0 + 0)$” in “$r_1^p(0 + 0)$” refers to all occurrences of subterms of this form.
Let

$$r_2^p(0 + 0) \equiv [0]^{p-1}_+ + (\ldots + ([0]^1_+ + \underline{(0 + 0)}) \ldots)$$

where “$(0 + 0)$” in “$r_2^p(0 + 0)$” refers only to the rightmost (i.e., the underlined)
occurrence of $0 + 0$. Note that $r_2^p(0 + 0) \equiv r_1^p(0)$. The needed instance $T$ of the
transitivity axiom is

$$(r_1^p(0 + 0) = r_1^p(0) \wedge r_2^p(0 + 0) = r_2^p(0)) \supset r_1^p(0 + 0) = r_2^p(0).$$

Let

$$t_1^p \equiv [0]^p_+$$

and

$$t_2^p \equiv [0]^{p-1}_+ + (\ldots + ([0]^1_+ + 0) \ldots).$$

Note that $t_2^p \equiv r_2^p(0)$ and $t_1^p + t_2^p \equiv r_1^p(0 + 0)$; $t_1^p$ is identical to the parameter
term and the conclusion of the instance of the transitivity axiom is identical to
the instance of the premise in $x + y = y \supset x = 0$. Consequently the sequent is of form

$$N \supset B_1,\ N \supset B_2,\ (B_1 \wedge B_2) \supset B \vdash ((B \supset C) \wedge N) \supset C$$

and therefore tautological. ■
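The propositional form displayed at the end of the proof can be checked by brute force over all truth assignments. The sketch below does this with a small truth-table test; the helper names and the encoding of formulas as Python functions are assumptions made for illustration, and the atoms N, B1, B2, B, C stand for the five propositional components of the sequent.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def is_tautological(antecedent, consequent, atoms):
    """A sequent is tautological if no assignment makes every antecedent
    formula true while making every consequent formula false."""
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(f(v) for f in antecedent) and not any(g(v) for g in consequent):
            return False
    return True

ATOMS = ["N", "B1", "B2", "B", "C"]
ANTECEDENT = [
    lambda v: implies(v["N"], v["B1"]),              # first ID= instance
    lambda v: implies(v["N"], v["B2"]),              # second ID= instance
    lambda v: implies(v["B1"] and v["B2"], v["B"]),  # transitivity instance
]
CONSEQUENT = [lambda v: implies(implies(v["B"], v["C"]) and v["N"], v["C"])]
```

Dropping the transitivity instance (the third antecedent formula) makes the check fail, which matches the role that instance plays in the proof.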



Remark 1. A variant of this statement for number theory has been proven in
[Yukami, T., 1984]. See also [Baaz, M. and Pudlák, P., 1993].

Remark 2. Note that only few properties of “+” and “0” are used. In particular, 
one may think of “0” as the neutral element of an arbitrary group (w.r.t. +). 
From this observation it follows that the implication x + y = y D x = 0 can be 
replaced by pure identities in principle. 
As mentioned above, each instance of the schema of identity is provable
from identity axioms. However, the instances of ID$^{=}$ in the tautological sequent
of Theorem 1 cannot be replaced by a number of instances of identity axioms
that is independent of the “size” $p$ of the parameter term $[0]^p_+$, as the following
proposition shows.

Proposition 2. Let $\Gamma_p \vdash \Delta_p$ be tautological, where $\Gamma_p$ consists of instances
of identity axioms, and $\Delta_p$ consists of instances of $A([0]^p_+, x, y)$ (for $A$ as in
Theorem 1). Then $|\Gamma_p| + |\Delta_p|$ cannot be uniformly bounded by a constant.

Proof. Indirect. Assume that there are constant numbers $a$ and $b$ such that for
all $p$ we have $|\Gamma_p| = a$ and $|\Delta_p| = b$. We look at a term minimal generalization of
the sequent $\Gamma_p \vdash \Delta_p$. I.e., we replace the formulas in $\Delta_p$ by formulas $A(u, x_i, y_i)$,
where the $x_i$, $y_i$ ($1 \leq i \leq b$) are pairwise distinct variables distinct also from $u$.
Moreover, we replace each formula in $\Gamma_p$ by a fresh copy of the identity axiom
$I_j$ of which it is an instance. The resulting sequent

$$I_1, \ldots, I_a \vdash A(u, x_1, y_1), \ldots, A(u, x_b, y_b)$$

is no longer tautological. However, by simultaneously unifying all those pairs of
occurrences of atoms $(s = t, s' = t')$ for which the corresponding atoms in $\Gamma_p \vdash \Delta_p$
are identical, we regain a tautological sequent

$$\sigma(I_1), \ldots, \sigma(I_a) \vdash A(\sigma(u), \sigma(x_1), \sigma(y_1)), \ldots, A(\sigma(u), \sigma(x_b), \sigma(y_b))$$

where $\sigma$ is the most general unifier. This sequent is independent of $p$,
but the term $[0]^p_+$ is an instance of $\sigma(u)$. Therefore $\sigma(u)$ must be of form
$0 + (\ldots (0 + z) \ldots)$ where $z$ is a variable. However, this implies that the sequent
cannot be tautological. (Standard arithmetic provides a counter example.) ■



Remark 3. The above statement can be strengthened by allowing also instances 
of valid universal number theoretic formulas in $\Gamma_p$.

4 Monadic Parameter Terms 

Looking at the formulation of Theorem 1 one might be tempted to believe that
it is essential that the parameter term $[0]^p_+$ is built up using a binary function
symbol. However an analogous fact also holds for monadic parameter terms:

Theorem 2. Let $A(u, x, y) \equiv ((x + y = y \supset x = 0) \wedge f(x) = x) \supset u = 0$. (We
refer to all positions at which the variables $u$, $x$, $y$ occur.) For all $p \in \mathbb{N}$ there are
terms $r_1^p, r_2^p, t_1^p, t_2^p$ such that the sequent

$$f(0) = 0 \supset r_1^p(f(0)) = r_1^p(0),\ 0 + 0 = 0 \supset r_2^p(0 + 0) = r_2^p(0),\ T \vdash A(f^p(0), t_1^p, t_2^p)$$

is tautological, where $T$ is an instance of the transitivity axiom.
Proof. Let

$$r_1^p(z) \equiv f^{p-1}(z) + (\ldots + (z + 0) \ldots).$$

Therefore

$$r_1^p(f(0)) \equiv f^p(0) + (\ldots + (f(0) + 0) \ldots)$$

and

$$r_1^p(0) \equiv f^{p-1}(0) + (\ldots + (0 + 0) \ldots).$$

Note that this implies that also the leftmost formula of the sequent is an instance
of ID$^{=}$. Let

$$r_2^p(z) \equiv f^{p-1}(0) + (\ldots + (f(0) + z) \ldots).$$

Setting

$$t_1^p \equiv f^p(0)$$

and

$$t_2^p \equiv f^{p-1}(0) + (\ldots + (f(0) + 0) \ldots)$$

it is straightforward to check that the sequent is tautological (analogously to
the proof of Theorem 1). ■
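The syntactic identities that the proof relies on (the term obtained from the first context at 0 coincides with the second context at 0 + 0, and the sum of the two witness terms coincides with the first context at f(0)) can be checked mechanically for small p. The sketch below builds the terms as nested tuples; the encoding, the function names `r1`, `r2`, `f_iter`, `plus`, and the reading of the elided middle part of each term as the evident right-nested sum are assumptions made here.

```python
ZERO = ("0",)

def f_iter(n, t):
    """Build f^n(t) by wrapping t in n applications of f."""
    for _ in range(n):
        t = ("f", t)
    return t

def plus(a, b):
    return ("+", a, b)

def r1(p, z):
    """f^{p-1}(z) + (... + (f(z) + (z + 0)) ...), built from the inside out."""
    term = ZERO
    for k in range(p):
        term = plus(f_iter(k, z), term)
    return term

def r2(p, z):
    """f^{p-1}(0) + (... + (f(0) + z) ...), built from the inside out."""
    term = z
    for k in range(1, p):
        term = plus(f_iter(k, ZERO), term)
    return term

def witnesses(p):
    """Return the pair (t1, t2) used in the proof for parameter size p."""
    return f_iter(p, ZERO), r2(p, ZERO)
```

With these definitions the conclusion of the transitivity instance has the shape `plus(t1, t2) = t2`, which is the instantiated premise of the implication in the theorem.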



Remark 4. Note that in the proof above two instances of the schema of identity
are used. (See also the final remark, at the end of the paper.)

Again, one can show that the two instances of ID$^{=}$ cannot be replaced by a
uniformly bounded number of instances of identity axioms:

Proposition 3. Let $\Gamma_p \vdash \Delta_p$ be tautological, where $\Gamma_p$ consists of instances
of identity axioms and $\Delta_p$ consists of instances of $A(f^p(0), x, y)$ (for $A$ as in
Theorem 2). Then $|\Gamma_p| + |\Delta_p|$ cannot be uniformly bounded by a constant.

Proof. Analogous to the proof of Proposition 2.

5 Cases Where the Schema of Identity Is “Weak” 

Definition 1. We call a unary function symbol $s$ fanning out if there is a set
of axioms NFP$_s$ such that for all $m \neq n$ the formula $s^m(x) \neq s^n(x)$ is provable
from NFP$_s$ and the identity axioms.

We show that — in contrast to the results in the last section — “short” 
proofs using one instance of the schema of identity can be transformed into 
“short” proofs using only identity axioms if the parameter term is composed of 
a fanning out function symbol. 
Theorem 3. Let the propositional sequent

$$\Xi,\ s = t \supset r_1(s) = r_1(t), \ldots, s = t \supset r_k(s) = r_k(t) \vdash A(s^p(0), t_1), \ldots, A(s^p(0), t_n) \qquad (H^{id})$$

be provable in LK, where $\Xi$ consists of instances of identity axioms, and $k$ as
well as $|\Xi|$ are independent of the size $p$ of the parameter term $s^p(0)$.

If $s$ is fanning out then the instances of ID$^{=}$ are redundant in the following
sense: A sequent

$$\Xi',\ \Pi \vdash A(s^p(0), t'_1), \ldots, A(s^p(0), t'_r) \qquad (H^{0})$$

is provable in LK, where $r \leq 2n$, $\Pi$ is a (possibly empty) sequence of instances
of axioms in NFP$_s$ and $\Xi'$ is a sequence of instances of identity axioms and both
$|\Xi'|$ and $|\Pi|$ are independent of the size $p$ of the parameter term $s^p(0)$.

Proof. Case 1: $s \equiv t$. In this case the instances of ID$^{=}$ in $(H^{id})$ clearly can
be replaced by the instances $r_1(s) = r_1(s), \ldots, r_k(s) = r_k(s)$ of the reflexivity
axiom to obtain the required Herbrand disjunction $(H^{0})$.

Case 2: $s \equiv s^{\ell_1}(0)$ and $t \equiv s^{\ell_2}(0)$, $\ell_1 \neq \ell_2$. Since

$$s \neq t \vdash s = t \supset r_i(s) = r_i(t)$$

is a tautology we can transform the derivation of $(H^{id})$ into one of

$$\Xi,\ s \neq t \vdash A(s^p(0), t_1), \ldots, A(s^p(0), t_n) \qquad (1)$$

By the form of $s$ and $t$ and the fact that $s$ is fanning out, $s \neq t$ is derivable
from NFP$_s$ and identity axioms. In other words, there exists a tautological
sequent

$$\Xi,\ \Pi \vdash A(s^p(0), t_1), \ldots, A(s^p(0), t_n)$$

where $\Pi$ consists of instances of axioms in NFP$_s$ and identity axioms. However,
in general, the size of $s$ and $t$ will depend on the depth $p$ of the parameter,
which implies that $|\Pi|$ is dependent on $p$ as well. Therefore we transform
sequent (1) into another tautological sequent

$$\Xi',\ s' \neq t' \vdash A(s^p(0), t'_1), \ldots, A(s^p(0), t'_n)$$

where the sizes of terms $s'$ and $t'$ are independent of $p$. To this end we replace
in (1) all occurrences of $s^p(0)$ by a fresh variable $u$, all indicated occurrences
of terms in $t_1, \ldots, t_n$ by pairwise distinct and new variables, each formula in
$\Xi$ by a fresh copy of the identity axiom of which it is an instance, as well as
$s \neq t$ by $v \neq w$, where $v$, $w$ are fresh variables as well. Of course, the resulting
sequent

$$\Xi^0,\ v \neq w \vdash A(u, x_1), \ldots, A(u, x_n) \qquad (2)$$
is not tautological in general. However, by simultaneously unifying in (2)
all those pairs of occurrences of atoms $(P^0, Q^0)$ for which the corresponding
atoms $P$ and $Q$ in (1) are identical, we regain a tautological sequent of form

$$\Xi^*,\ s^{k_1}(u_1) \neq s^{k_2}(u_2) \vdash A(s^{k_3}(z), t_1^*), \ldots, A(s^{k_3}(z), t_n^*) \qquad (3)$$

where each of $u_1$, $u_2$ and $z$ is either a variable or the constant 0. Moreover,
$s$ is an instance of $s^{k_1}(u_1)$, $t$ an instance of $s^{k_2}(u_2)$ and the parameter $s^p(0)$
an instance of $s^{k_3}(z)$. Observe that, if we use the most general unifier
to obtain (3) from (2), $k_1$, $k_2$ and $k_3$ are independent of $p$. W.l.o.g. we
assume that $k_1 \leq k_2$. We can also assume that $z$ is a variable, since otherwise
$s^{k_3}(z) \equiv s^p(0)$ and therefore nothing is left to prove.

To reconstruct the parameter at the appropriate places, i.e., to regain the
required Herbrand disjunction of $\exists x\,A(s^p(0), x)$ at the right hand side of
the sequent, we define a variable substitution $\sigma$ that is to be applied to
the whole sequent (3).

Case 2.1: $u_1 \not\equiv z$ and $u_2 \not\equiv z$. For $i = 1, 2$, if $u_i$ is a variable then set either
$\sigma(u_i) = 0$ or $\sigma(u_i) = s(0)$ in such a way that $\sigma(s^{k_1}(u_1)) \not\equiv \sigma(s^{k_2}(u_2))$.

Case 2.2: $u_1 \equiv z$ and $u_2 \equiv z$. $\sigma$ remains empty. (Observe that in this case
$k_1 \neq k_2$.)

Case 2.3: $u_1 \equiv z$ and $u_2 \equiv 0$. We set $\sigma(z) = s^{k_2 - k_1 + 1}(z)$.

Case 2.4: $u_1 \equiv z$ and $u_2$ is a (different) variable. We set $\sigma(z) = s^{k_2 - k_1 + 1}(u_2)$.

Case 2.5: $u_2 \equiv z$ and $u_1 \equiv 0$. We set $\sigma(z) = s(z)$.

Case 2.6: $u_2 \equiv z$ and $u_1$ is a (different) variable. We set $\sigma(z) = s(u_1)$.

Note that, in all cases, the depth of $\sigma(s^{k_1}(u_1) \neq s^{k_2}(u_2))$ is independent of $p$.
Moreover $\sigma(s^{k_1}(u_1)) \not\equiv \sigma(s^{k_2}(u_2))$ which implies that a sequent

$$\Sigma \vdash \sigma(s^{k_1}(u_1) \neq s^{k_2}(u_2)) \qquad (4)$$

is derivable where $\Sigma$ consists of a number of instances of axioms in NFP$_s$
and identity axioms that is independent of $p$. Using (4) and the result of
applying $\sigma$ to (3) we obtain

$$\Sigma' \vdash A(s^{p'}(z), t'_1), \ldots, A(s^{p'}(z), t'_n)$$

where $\Sigma'$ consists of instances from NFP$_s$ and identity axioms and $|\Sigma'|$ is
independent of $p$. Since $p' \leq p$ it remains to instantiate $s^{p - p'}(0)$ for $z$ to
obtain the required Herbrand disjunction $(H^{0})$.

Case 3: Either $s \not\equiv s^{\ell}(0)$ or $t \not\equiv s^{\ell}(0)$ for all $\ell \geq 0$. W.l.o.g., we assume $t$ is
not of this form.

Since

$$s \neq t \vdash s = t \supset r_1(s) = r_1(t), \ldots, s = t \supset r_k(s) = r_k(t)$$

is a tautological sequent also

$$s \neq t,\ \Xi \vdash A(s^p(0), t_1), \ldots, A(s^p(0), t_n) \qquad (5)$$

is tautological. It thus remains to show that we can also derive a sequent

$$s = t,\ \Xi' \vdash A(s^p(0), t'_1), \ldots, A(s^p(0), t'_n) \qquad (6)$$

where $\Xi'$ consists of axiom instances and $|\Xi'|$ does not depend on $p$. We can
then join (5) and (6), by applying the cut rule, to derive the required
Herbrand disjunction $(H^{0})$ (which will have $2n$ disjuncts).

Observe that possibly either $s \equiv t[s]$ or $t \equiv s[t]$. However, w.l.o.g., we may
assume that $t \not\equiv s[t]$ (i.e., $t$ does not occur in $s$, but possibly vice versa). We
begin by replacing in $(H^{id})$ all occurrences of $t$ by $s$. Since new occurrences
of $t$ might arise we repeat this step until no occurrences of $t$ are left in the
sequent. Let us denote the result by

$$\{\Xi\}^{t*}_{s},\ s = s \supset \{r_1(s)\}^{t*}_{s} = \{r_1(t)\}^{t*}_{s}, \ldots, s = s \supset \{r_k(s)\}^{t*}_{s} = \{r_k(t)\}^{t*}_{s} \vdash \{A(s^p(0), t_1)\}^{t*}_{s}, \ldots, \{A(s^p(0), t_n)\}^{t*}_{s}$$

Since $\{r_i(s)\}^{t*}_{s} \equiv \{r_i(t)\}^{t*}_{s}$ the instances of ID$^{=}$ can be replaced by $k$ instances
of the reflexivity axiom. We denote this sequence of $k$ axiom instances
by $\Xi'$ and obtain

$$\{\Xi\}^{t*}_{s},\ \Xi' \vdash \{A(s^p(0), t_1)\}^{t*}_{s}, \ldots, \{A(s^p(0), t_n)\}^{t*}_{s} \qquad (7)$$

Observe that although, by assumption, $t$ is not of form $s^{\ell}(0)$ it might properly
contain the parameter $s^p(0)$ as subterm. Thus in (7) the parameter might
have been replaced; moreover $A(., .) \not\equiv \{A(., .)\}^{t*}_{s}$ in general. Similarly the
iterated replacement of $t$ by $s$ might result in members of $\{\Xi\}^{t*}_{s}$ that are
no longer instances of identity axioms. It thus remains to reconstruct the
required Herbrand disjunction relative to axioms by re-substituting $t$ for
some occurrences of $s$ in (7). For this we have to use $s = t$.

We can derive in LK any propositional sequent of form

$$s = t,\ \Pi,\ B \vdash B' \qquad (8)$$

where $\Pi$ consists of instances of identity axioms and $B'$ is like $B$ except for
replacing a single occurrence of $s$ in $B$, say at position $\pi$ in $B$, by $t$.
Clearly, we can bound $|\Pi|$ by $|\pi|$, i.e. the depth of the position $\pi$. By setting
$\{A(s^p(0), t_i)\}^{t*}_{s}$ or a formula in $\{\Xi\}^{t*}_{s}$ for $B$ and (repeatedly) applying the
cut rule we can transform the derivation of (7) and derivations of appropriate
instances of (8) into a derivation of

$$s = t,\ \Xi',\ \Pi' \vdash A(s^p(0), \{t_1\}^{t*}_{s}), \ldots, A(s^p(0), \{t_n\}^{t*}_{s}) \qquad (9)$$

where $\Xi'$ and $\Pi'$ consist of instances of identity axioms and $|\Xi'| = |\Xi|$.
Since we replaced only occurrences of $s$ in positions that are already present
in $A(., .)$ and in the identity axioms, respectively, $|\Pi'|$ does not depend on
the size of the parameter $p$. Therefore (9) is the sequent (6) that we have
been looking for. ■
The above proof is easily adapted to the case where the parameter is composed
using a binary function symbol.

Definition 2. We call a binary function symbol $\circ$ fanning out in presence of a
set of axioms NFP$_\circ$ if for all $m \neq n$ the formula $[x]^m_\circ \neq [x]^n_\circ$ is provable
from NFP$_\circ$ and the identity axioms.

Corollary 1. Let the propositional sequent

$$\Xi,\ s = t \supset r_1(s) = r_1(t), \ldots, s = t \supset r_k(s) = r_k(t) \vdash A([c]^p_\circ, t_1), \ldots, A([c]^p_\circ, t_n)$$

be provable in LK, where $\Xi$ consists of instances of identity axioms, and $k$ as
well as $|\Xi|$ are independent of the size $p$ of the parameter $[c]^p_\circ$.

If $\circ$ is fanning out then a sequent

$$\Xi',\ \Pi \vdash A([c]^p_\circ, t'_1), \ldots, A([c]^p_\circ, t'_r)$$

is provable in LK, where $r \leq 2n$, $\Pi$ is a (possibly empty) sequence of instances
of the axioms NFP$_\circ$ and $\Xi'$ is a sequence of instances of identity axioms and both
$|\Xi'|$ and $|\Pi|$ are independent of the size $p$ of the parameter.

6 Final Remark 

Two instances of the schema of identity (ID) are necessary indeed to obtain the
“short proofs” of Theorem 2. This is because in presence of arbitrary instances
of $s(x) = x$ the condition of “fanning out” in Theorem 3 can be replaced by the
following:

For every term $s$ and every $n$ there are terms $t_1, \ldots, t_n$ such that $s \neq t_i$
and $t_i \neq t_j$ is provable for all $i, j \in \{1, \ldots, n\}$, $i \neq j$.
Abstract. We investigate the relative complexity of two different methods
of cut-elimination in classical first-order logic, namely the methods
of Gentzen and Tait. We show that the methods are incomparable, in the 
sense that both can give a nonelementary speed-up of the other one. More 
precisely we construct two different sequences of LK-proofs with cuts 
where cut-elimination for one method is elementary and nonelementary 
for the other one. Moreover we show that there is also a nonelementary 
difference in complexity for different deterministic versions of Gentzen’s 
method. 



1 Introduction 

Gentzen’s fundamental paper introduced cut-elimination as a fundamental pro- 
cedure to extract proof theoretic information from given derivations such as 
Herbrand’s Theorem, called Mid-Sequent Theorem in this context. In traditional 
proof theory, the general possibility of extracting such information is stressed, but
there is less interest in applying the procedures in concrete cases. This, however,
becomes essential if proof theory is considered as a basis for an automated
analysis of proofs, which becomes important in connection with the development
of effective problem solving software for mathematical applications such
as MATHEMATICA.

In this paper we compare the two most prominent cut-elimination procedures
for classical logic: Gentzen’s procedure and Tait’s procedure; we avoid
calling them “algorithms” because of their highly indeterministic aspects. From a
procedural point of view, they are characterized by their different cut-selection 
rule : Gentzen’s procedure selects a highest cut, while Tait’s procedure selects a 
largest one (w.r.t. the number of connectives and quantifiers). The most important
logical feature of Gentzen’s procedure is that, contrary to Tait’s method,
it transforms intuitionistic proofs into intuitionistic proofs (within LK) and
there is no possibility to take into account classical logic when intended. Tait’s



R. Kahle, P. Schroeder-Heister, and R. Stark (Eds.): PTCS 2001, LNCS 2183, pp. 49-^^ 2001. 
© Springer-Verlag Berlin Heidelberg 2001






M. Baaz and A. Leitsch 



procedure, on the other hand, does not change the inner connections of the 
derivation, it replaces cuts by smaller ones without reordering them. 

In this paper, we use the sequence $\gamma_n$ of LK-proofs corresponding to Statman’s
worst-case sequence to compare Gentzen’s and Tait’s procedure. The sequence
$\gamma_n$ is transformed twice: first into a sequence where Tait’s method
speeds up Gentzen’s nonelementarily, and second into a sequence $\phi_n$ giving the
converse effect. As a complexity measure we take the total number of symbol
occurrences in reduction sequences of cut-elimination (i.e. all symbol occurrences
in all proofs occurring during the cut-elimination procedure are measured). Both
methods are nondeterministic in nature. But also different deterministic versions
of one and the same method may differ quite strongly: we show that even two different
deterministic versions of Gentzen’s method differ nonelementarily (w.r.t.
the total lengths of the corresponding reduction sequences).

Finally we would like to emphasize that the main goal of this paper is to 
give a comparison of different cut-elimination methods. It is not our intention to 
investigate, at the same time, the efficiency of calculi; for this reason we do not 
work with improved or computationally optimized versions of LK, but rather 
take a version of LK which is quite close to the original one. 

2 Definitions and Notation 

Definition 1 (complexity of formulas). If $F$ is a formula in PL then the
complexity $comp(F)$ is the number of logical symbols occurring in $F$. Formally
we define

$comp(F) = 0$ if $F$ is an atom formula,

$comp(F) = 1 + comp(A) + comp(B)$ if $F \equiv A \circ B$ for $\circ \in \{\wedge, \vee, \rightarrow\}$,
$comp(F) = 1 + comp(A)$ if $F \equiv \neg A$ or $F \equiv (Qx)A$ for $Q \in \{\forall, \exists\}$.
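The recursive definition of the complexity measure translates directly into a short function. The following is a minimal sketch; the tuple encoding of formulas and the tag names (`"atom"`, `"and"`, `"or"`, `"imp"`, `"not"`, `"forall"`, `"exists"`) are assumptions made here for illustration.

```python
BINARY = {"and", "or", "imp"}
QUANTIFIERS = {"forall", "exists"}

def comp(formula):
    """Number of logical symbols (connectives and quantifiers) in a formula.

    Formulas are tuples: ("atom", name), (binary_tag, A, B), ("not", A),
    or (quantifier_tag, variable, A).
    """
    tag = formula[0]
    if tag == "atom":
        return 0
    if tag in BINARY:
        return 1 + comp(formula[1]) + comp(formula[2])
    if tag == "not":
        return 1 + comp(formula[1])
    if tag in QUANTIFIERS:
        return 1 + comp(formula[2])  # formula[1] is the bound variable
    raise ValueError(f"unknown formula tag: {tag!r}")

# forall x (P(x) and not Q(x)) contains one quantifier and two connectives
EXAMPLE = ("forall", "x", ("and", ("atom", "P(x)"), ("not", ("atom", "Q(x)"))))
```

The same measure is what the grade of a mix-derivation is later defined from.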

Definition 2 (sequent). A sequent is an expression of the form $\Gamma \vdash \Delta$ where
$\Gamma$ and $\Delta$ are finite multisets of PL-formulas (i.e. two sequents $\Gamma_1 \vdash \Delta_1$ and
$\Gamma_2 \vdash \Delta_2$ are considered equal if the multisets represented by $\Gamma_1$ and by $\Gamma_2$ are
equal and those represented by $\Delta_1$, $\Delta_2$ are also equal).
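Equality of sequents up to multiset equality means that the order of formulas is irrelevant while their multiplicity is not. A minimal sketch using `collections.Counter` as the multiset type is given below; the class name `Sequent` and its attribute names are assumptions made for illustration.

```python
from collections import Counter

class Sequent:
    """A sequent whose antecedent and consequent are multisets of formulas."""

    def __init__(self, antecedent, consequent):
        self.antecedent = Counter(antecedent)
        self.consequent = Counter(consequent)

    def __eq__(self, other):
        # Equal iff both sides agree as multisets: order is ignored,
        # but the number of occurrences of each formula matters.
        return (
            isinstance(other, Sequent)
            and self.antecedent == other.antecedent
            and self.consequent == other.consequent
        )
```

This is also why no exchange rules are needed in a calculus over such sequents, whereas contraction remains a genuine rule.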

Definition 3 (the calculus LK). The initial sequents are $A \vdash A$ for PL-formulas
$A$. In the rules of LK we always mark the auxiliary formulas (i.e.
the formulas in the premise(s) used for the inference) and the principal (i.e.
the inferred) formula using different marking symbols. Thus, in our definition,
$\wedge$-introduction to the right takes the form

$$\frac{\Gamma_1 \vdash \Delta_1, A^+ \qquad \Gamma_2 \vdash \Delta_2, B^+}{\Gamma_1, \Gamma_2 \vdash \Delta_1, A \wedge B^*, \Delta_2}$$

We usually avoid markings by putting the auxiliary formulas at the leftmost
position in the antecedent of sequents and in the rightmost position in the consequent
of sequents. The principal formula mostly is identifiable by the context.
Thus the rule above will be written as

$$\frac{\Gamma_1 \vdash \Delta_1, A \qquad \Gamma_2 \vdash \Delta_2, B}{\Gamma_1, \Gamma_2 \vdash \Delta_1, \Delta_2, A \wedge B}$$



Unlike Gentzen’s version of LK (see [...]) ours does not contain any “automatic”
contractions (in this paper we do not consider intuitionistic logic). Instead
we use the additive version of LK as in the book of Girard [...], combined with
multiset structure for the sequents (this is exactly the version of LK used in [...]).
By the definition of sequents over multisets we don’t need the exchange rules. In
our notation $\Gamma$, $\Delta$, $\Pi$ and $\Lambda$ serve as metavariables for multisets of formulas; $\vdash$
is the separation symbol. For a complete list of the rules we refer to [...]; we only
give three logical and three structural rules here.

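- The logical rule $\vee$-introduction left:

$$\frac{A, \Gamma \vdash \Delta \qquad B, \Pi \vdash \Lambda}{A \vee B, \Gamma, \Pi \vdash \Delta, \Lambda}\ \vee : l$$

- The logical rules for $\vee$-introduction right:

$$\frac{\Gamma \vdash \Delta, A}{\Gamma \vdash \Delta, A \vee B}\ \vee : r1 \qquad \frac{\Gamma \vdash \Delta, B}{\Gamma \vdash \Delta, A \vee B}\ \vee : r2$$

- The structural rules weakening left and right:

$$\frac{\Gamma \vdash \Delta}{A, \Gamma \vdash \Delta} \qquad \frac{\Gamma \vdash \Delta}{\Gamma \vdash \Delta, A}$$

- The cut rule:

$$\frac{\Gamma \vdash \Delta, A \qquad A, \Pi \vdash \Lambda}{\Gamma, \Pi \vdash \Delta, \Lambda}\ cut$$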



An LK-derivation is defined as a directed tree where the nodes are occurrences
of sequents and the edges are defined according to the rule applications in
LK. Let $\mathcal{A}$ be the set of sequents occurring at the leaf nodes of an LK-derivation
$\psi$ and $S$ be the sequent occurring at the root (called the end-sequent). Then we
say that $\psi$ is an LK-derivation of $S$ from $\mathcal{A}$ (notation $\mathcal{A} \vdash_{LK} S$). If $\mathcal{A}$ is a
set of initial sequents then we call $\psi$ an LK-proof of $S$. Note that, in general,
cut-elimination is only possible in LK-proofs.

We write

$$\begin{array}{c} (\psi) \\ S \end{array}$$

to express that $\psi$ is a proof with end sequent $S$.

Paths in an LK-derivation $\psi$, connecting sequent occurrences in $\psi$, are defined
in the traditional way; a branch in $\psi$ is a path starting in the end sequent.
We use the terms “predecessor” and “successor” in the intuitive sense (i.e. contrary
to the direction of edges in the tree): If there exists a path from $S_1$ to
$S_2$ then $S_2$ is called a predecessor of $S_1$. The successor relation is defined in an
analogous way. E.g. every initial sequent is a predecessor of the end sequent.

Definition 4. The length of a proof $\omega$ is defined by the number of symbol occurrences
in $\omega$ and is denoted by $l(\omega)$.

The famous proof of the cut-elimination property of LK is based on a double 
induction on rank and grade of a modified form of cut, namely the mix. 

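Definition 5 (mix). Let $\Gamma \vdash \Pi$ and $\Delta \vdash \Lambda$ be two sequents and $A$ be a formula
which occurs in $\Pi$ and in $\Delta$; let $\Pi^*$, $\Delta^*$ be $\Pi$, $\Delta$ without occurrences of $A$. Then
the rule

$$\frac{\Gamma \vdash \Pi \qquad \Delta \vdash \Lambda}{\Gamma, \Delta^* \vdash \Pi^*, \Lambda}\ mix$$

is called a mix on $A$. Frequently we label the rule by $mix(A)$ to indicate that the
mix is on $A$.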
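The effect of a mix on multisets can be illustrated with a few lines of code: all occurrences of the mix formula are removed from the consequent of the left premise and from the antecedent of the right premise, and the remaining multisets are joined. This is a sketch only; representing a sequent as a pair of lists and the function name `mix` are assumptions made here.

```python
from collections import Counter

def mix(left, right, a):
    """Apply a mix on formula a to two sequents given as (antecedent, consequent) pairs.

    Returns the conclusion as a pair of Counters (multisets).
    """
    gamma, pi = (Counter(side) for side in left)
    delta, lam = (Counter(side) for side in right)
    if pi[a] == 0 or delta[a] == 0:
        raise ValueError("the mix formula must occur in both premises")
    # Remove *all* occurrences of a, as required by the mix rule.
    pi_star = Counter({f: n for f, n in pi.items() if f != a})
    delta_star = Counter({f: n for f, n in delta.items() if f != a})
    return gamma + delta_star, pi_star + lam
```

A cut is the special case in which exactly one occurrence of the formula is removed on each side, which is why a mix can be simulated by a cut preceded by contractions.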



Definition 6. Let $\phi$ be an LK-proof and $\psi$ be a subderivation of the form

$$\frac{\begin{array}{c} (\psi_1) \\ \Gamma_1 \vdash \Delta_1 \end{array} \qquad \begin{array}{c} (\psi_2) \\ \Gamma_2 \vdash \Delta_2 \end{array}}{\Gamma_1, \Gamma_2^* \vdash \Delta_1^*, \Delta_2}\ mix(A)$$

Then we call $\psi$ a mix-derivation in $\phi$; if the mix is a cut we speak about a
cut-derivation. We define the grade of $\psi$ as $comp(A)$; the left-rank of $\psi$ is the
maximal number of nodes in a branch in $\psi_1$ s.t. $A$ occurs in the consequent of
a predecessor of $\Gamma_1 \vdash \Delta_1$. If $A$ is “produced” in the last inference of $\psi_1$ then the
left-rank of $\psi$ is 1. The right-rank is defined in an analogous way. The rank of
$\psi$ is the sum of right-rank and left-rank.

The cut-elimination method of Gentzen can be formalized as a reduction
method consisting of rank- and grade reductions on LK-proofs. Tait’s
method can also be defined in this way, but with another selection of a cut-derivation
in the proof. In a slight abuse of language we speak about cut-reduction, even if
the cuts are actually mixes.

Definition 7 (cut-reduction rule). In Gentzen’s proof a mix-derivation $\psi$ is
selected in an LK-proof $\phi$ and replaced by a derivation $\psi'$ (with the same end-sequent)
s.t. the corresponding mix-derivation(s) in $\psi'$ has either lower grade or
lower rank than $\psi$. These replacements can be interpreted as a reduction relation
on LK-proofs. Following the lines of Gentzen’s proof of the cut-elimination
property in [...] we give a formal definition of the relation $>$ on LK-proofs in the
Appendix.

Using $>$ we can define two proof reduction relations, $>_G$ for Gentzen reduction
and $>_T$ for Tait reduction. Let $\phi$ be a proof and let $\psi$ be a mix-derivation
in $\phi$ occurring at position $\lambda$ (we write $\phi = \phi[\psi]_\lambda$); assume that $\psi > \psi'$.



